Preface



The sixth Algorithmic Number Theory Symposium was held at the University 
of Vermont, in Burlington, from 13-18 June 2004. The organization was a joint 
effort of number theorists from around the world. 

There were four invited talks at ANTS VI, by Dan Bernstein of the Univer- 
sity of Illinois at Chicago, Kiran Kedlaya of MIT, Alice Silverberg of Ohio State 
University, and Mark Watkins of Pennsylvania State University. Thirty contri- 
buted talks were presented, and a poster session was held. This volume contains 
the written versions of the contributed talks and three of the four invited talks. 
(Not included is the talk by Dan Bernstein.) 

ANTS in Burlington is the sixth in a series that began with ANTS I in 1994 
at Cornell University, Ithaca, New York, USA and continued at Université 
Bordeaux I, Bordeaux, France (1996), Reed College, Portland, Oregon, USA (1998), 
the University of Leiden, Leiden, The Netherlands (2000), and the University 
of Sydney, Sydney, Australia (2002). The proceedings have been published as 
volumes 877, 1122, 1423, 1838, and 2369 of Springer-Verlag’s Lecture Notes in 
Computer Science series. 

The organizers of the 2004 ANTS conference express their special gratitude 
and thanks to John Cannon and Joe Buhler for invaluable behind-the-scenes 
advice. 
Computing Zeta Functions via p-Adic Cohomology 



Kiran S. Kedlaya* 

Department of Mathematics 
Massachusetts Institute of Technology 
77 Massachusetts Avenue 
Cambridge, MA 02139 
kedlaya@math.mit.edu 
http://math.mit.edu/~kedlaya/ 



Abstract. We survey some recent applications of p-adic cohomology to 
machine computation of zeta functions of algebraic varieties over finite 
fields of small characteristic, and suggest some new avenues for further 
exploration. 



1 Introduction 



1.1 The Zeta Function Problem 



For $X$ an algebraic variety over $\mathbb{F}_q$ (where we write $q = p^n$ for $p$ prime), the zeta 
function 

$$Z(X,t) = \exp\left( \sum_{i=1}^{\infty} \#X(\mathbb{F}_{q^i}) \frac{t^i}{i} \right)$$

is a rational function of $t$. This fact, the first of the celebrated Weil Conjectures, 
follows from Dwork’s proof using $p$-adic analysis [12], or from the properties of 
étale ($\ell$-adic) cohomology (see [14] for an introduction). 

In recent years, the algorithmic problem of determining $Z(X,t)$ from defining 
equations of $X$ has come into prominence, primarily due to its relevance in 
cryptography. Namely, to perform cryptographic functions using the Jacobian 
group of a curve over $\mathbb{F}_q$, one must first compute the order of said group, and this 
is easily retrieved from the zeta function of the curve (as $Q(1)$, where $Q(t)$ is as 
defined below). However, the problem is also connected with other applications 
of algebraic curves (e.g., coding theory) and with other computational problems 
in number theory (e.g., determining Fourier coefficients of modular forms). 
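
To make the link between point counts, the zeta function, and the Jacobian order concrete, the following minimal sketch treats the genus 1 case by brute force. It is an illustration only, not one of the algorithms surveyed here: the function names and the sample curve $y^2 = x^3 + x + 1$ over $\mathbb{F}_5$ are assumptions chosen for the example, and naive enumeration is feasible only for tiny fields.

```python
def count_points(p, a, b):
    """Count projective points on y^2 = x^3 + a*x + b over F_p (p an odd prime)."""
    # square_count[v] = number of y in F_p with y^2 = v
    square_count = [0] * p
    for y in range(p):
        square_count[y * y % p] += 1
    affine = sum(square_count[(x**3 + a * x + b) % p] for x in range(p))
    return affine + 1  # add the single point at infinity


def zeta_numerator(p, a, b):
    """Coefficients [1, -trace, p] of Q(t) = 1 - trace*t + p*t^2."""
    trace = p + 1 - count_points(p, a, b)
    return [1, -trace, p]


def points_over_extension(Q, p, m):
    """#X(F_{p^m}) for m >= 1, read off from Q(t) = (1 - alpha*t)(1 - beta*t)."""
    trace = -Q[1]
    s_prev, s_cur = 2, trace  # power sums s_0 and s_1 of alpha, beta
    for _ in range(m - 1):
        # Newton's identity: s_{k+1} = trace * s_k - p * s_{k-1}
        s_prev, s_cur = s_cur, trace * s_cur - p * s_prev
    return p**m + 1 - s_cur


Q = zeta_numerator(5, 1, 1)   # hypothetical sample curve y^2 = x^3 + x + 1 over F_5
jacobian_order = sum(Q)       # Q(1), the order of the Jacobian
```

Here a single point count over the base field determines $Q(t)$, and hence the counts over every extension field; for higher genus one needs more information, which is what the cohomological methods below supply.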

Even if one restricts $X$ to being a curve of genus $g$, in which case 

$$Z(X,t) = \frac{Q(t)}{(1-t)(1-qt)}$$

with $Q(t)$ a polynomial over $\mathbb{Z}$ of degree $2g$, there is no algorithm known¹ for 
computing $Z(X,t)$ which is polynomial in the full input size, i.e., in $g$, $n$, and $\log(p)$. 
However, if one allows polynomial dependence in $p$ rather than its logarithm, 
then one can obtain a polynomial time algorithm using Dwork’s techniques, as 
shown by Lauder and Wan [30]. The purpose of this paper is to illustrate how 
these ideas can be converted into more practical algorithms in many cases. 

* Thanks to Michael Harrison, Joe Suzuki, and Fré Vercauteren for helpful comments, 
and to David Savitt for carefully reading an early version of this paper. 

This paper has a different purpose in mind than most prior and current work 
on computing zeta functions, which has been oriented towards low-genus curves 
over large fields (e.g., elliptic curves of “cryptographic size”). This problem is 
well under control now; however, we are much less adept at handling curves of 
high genus or higher dimensional varieties over small fields. It is in this arena that 
p-adic methods should prove especially valuable; our hope is for this paper, which 
mostly surveys known algorithmic results on curves, to serve as a springboard for 
higher-genus and higher-dimensional investigations. 

1.2 The Approach via p-Adic Cohomology 

Historically, although Dwork’s proof predated the advent of $\ell$-adic cohomology, 
it was soon overtaken as a theoretical tool² by the approach favored by the 
Grothendieck school, in which context the Weil conjectures were ultimately 
resolved by Deligne [8]. The purpose of this paper is to show that by contrast, from 
an algorithmic point of view, “Dworkian” p-adic methods prove to be much more 
useful. 

A useful analogy is the relationship between topological and algebraic de 
Rham cohomology of varieties over C. While the topological cohomology is more 
convenient for proving basic structural results, computations are often more con- 
venient in the de Rham setting, since it is so closely linked to defining equations. 
The analogy is more than just suggestive: the p-adic constructions we have in 
mind are variants of and closely related to algebraic de Rham cohomology, from 
which they inherit some computability. 

1.3 Other Computational Approaches 

There are several other widely used approaches for computing zeta functions; for 
completeness, we briefly review these and compare them with the cohomological 
point of view. 

The method of Schoof [45] (studied later by Pila [40] and Adleman-Huang [1]) 
is to compute the zeta function modulo $\ell$ for various small primes $\ell$, then apply 
bounds on the coefficients of the zeta function plus the Chinese remainder 
theorem. This loosely corresponds to computing in $\ell$-adic and not $p$-adic cohomology. 

¹ That is, unless one resorts to quantum computation: one can imitate Shor’s quantum 
factoring algorithm to compute the order of the Jacobian over $\mathbb{F}_{q^n}$ for $n$ up to about 
$2g$, and then recover $Z(X,t)$. See [26]. 

² The gap has been narrowed recently by the work of Berthelot and others; for instance, 
in [25], one recovers the Weil conjectures by imitating Deligne’s work using p-adic 
tools. 
This has the benefit of working well even in large characteristic; on the downside, 
one can only treat curves, where $\ell$-adic cohomology can be reinterpreted in terms 
of Jacobian varieties, and moreover, one must work with the Jacobians rather 
concretely (to extract division polynomials), which is algorithmically unwieldy. 
In practice, Schoof’s method has only been deployed in genus 1 (by Schoof’s 
original work, using improvements by Atkin, Elkies, Couveignes-Morain, etc.) 
and genus 2 (by work of Gaudry and Harley [17], with improvements by Gaudry 
and Schost [18]). 
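
The final step of a Schoof-style computation, combining residues modulo small primes with a bound on the answer, can be sketched as follows for the trace of Frobenius of an elliptic curve. This is only the recombination step under the Hasse bound $|t| \le 2\sqrt{q}$; the residues are assumed to be given (computing them is the hard part of Schoof's method), and the function names are hypothetical.

```python
from math import isqrt


def crt(residues, moduli):
    """Combine x = r_i (mod m_i) for pairwise coprime m_i; return (x mod M, M)."""
    x, M = 0, 1
    for r, m in zip(residues, moduli):
        # Choose k so that x + M*k = r (mod m).
        k = (r - x) * pow(M, -1, m) % m
        x += M * k
        M *= m
    return x % M, M


def trace_from_residues(q, residues, moduli):
    """Recover the trace t with |t| <= 2*sqrt(q) from its residues mod small primes."""
    bound = isqrt(4 * q)  # floor(2*sqrt(q))
    x, M = crt(residues, moduli)
    if M <= 2 * bound:
        raise ValueError("product of moduli too small to determine the trace")
    # Pick the representative of x mod M of smallest absolute value.
    return x - M if x > M // 2 else x
```

For instance, with q = 5 and residues 1, 0, 4 modulo 2, 3, 7, the call `trace_from_residues(5, [1, 0, 4], [2, 3, 7])` returns -3.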

A more p-adic approach was given by Satoh [43], based on iteratively com- 
puting the Serre-Tate canonical lift [46] of an ordinary abelian variety, where 
one can read off the zeta function from the action of Frobenius on the tangent 
space at the origin. A related idea, due to Mestre, is to compute “p-adic periods” 
using a variant of the classical AGM iteration for computing elliptic integrals. 
This method has been used to set records for zeta function computations in 
characteristic 2 (e.g., [33]). The method extends in principle to higher character- 
istic [27] and genus (see [41], [42] for the genus 3 nonhyperelliptic case), but it 
seems difficult to avoid exponential dependence on genus and practical hangups 
in handling not-so-small characteristics. 

We summarize the comparison between these approaches in the following 
table. (The informal comparison in the n column is based on the case of elliptic 
curves of a fixed small characteristic.) 



Table 1. Comparison of strategies for computing zeta functions 

                                     Dependence on
Algorithm class     Applicability    p              n                  g
------------------  -------------    -------------  -----------------  --------------------
Schoof              curves           polylog        big polynomial     at least exponential
Canonical lift/AGM  curves           polynomial     small polynomial   at least exponential
p-adic cohomology   general          nearly linear  medium polynomial  polynomial


2 Some p-Adic Cohomology 

In this section, we briefly describe some constructions of p-adic cohomology, 
amplifying the earlier remark that it strongly resembles algebraic de Rham co- 
homology. 



2.1 Algebraic de Rham Cohomology 

We start by recalling how algebraic de Rham cohomology is constructed. First 
suppose $X = \operatorname{Spec} A$ is a smooth affine variety³ over a field $K$ of characteristic 
zero. Let $\Omega^1_{A/K}$ be the module of Kähler differentials, and put 
$\Omega^i_{A/K} = \wedge^i_A \Omega^1_{A/K}$; these are finitely generated locally free $A$-modules 
since $X$ is smooth. By a theorem of Grothendieck [21], the cohomology of the complex 
$\Omega^{\bullet}_{A/K}$ is finite dimensional. 

³ By “variety over $K$” we always mean a separated, finite type $K$-scheme. 

If X is smooth but not necessarily affine, one has similar results on the sheaf 
level. That is, the hypercohomology of the complex formed by the sheaves of 
differentials is finite dimensional. In fact, Grothendieck proves his theorem first 
when X is smooth and proper, where the result follows by a comparison theorem 
to topological cohomology (via Serre’s GAGA theorem), then uses resolution of 
singularities to deduce the general case. 

For general $X$, one can no longer use the modules of differentials, as they 
fail to be coherent. Instead, following Hartshorne [22], one (locally) embeds $X$ 
into a smooth scheme $Y$, and computes de Rham cohomology on the formal 
completion of $Y$ along $X$. 

As one might expect from the above discussion, it is easiest to compute 
algebraic de Rham cohomology on a variety $X$ if one is given a good 
compactification $\bar{X}$, i.e., a smooth proper variety such that $\bar{X} \setminus X$ is a normal crossings 
divisor. Even absent that, one can still make some headway by computing with 
$\mathcal{D}$-modules (where $\mathcal{D}$ is a suitable ring of differential operators), as shown by 
Oaku, Takayama, Walther, et al. (see for instance [52]). 

2.2 Monsky-Washnitzer Cohomology 

We cannot sensibly work with de Rham cohomology directly in characteristic p, 
because any derivation will kill p-th powers and so the cohomology will not typi- 
cally be finite dimensional. Monsky and Washnitzer [38], [36], [37] (see also [49]) 
introduced a p-adic cohomology which imitates algebraic de Rham cohomology 
by lifting the varieties in question to characteristic zero in a careful way. 

Let $X = \operatorname{Spec} A$ be a smooth affine variety over a finite field $\mathbb{F}_q$ with $q = p^n$, 
and let $W$ be the ring of Witt vectors over $\mathbb{F}_q$, i.e., the unramified extension of 
$\mathbb{Z}_p$ with residue field $\mathbb{F}_q$. By a theorem of Elkik [13], we can find a smooth affine 
scheme $\tilde{X}$ over $W$ such that $\tilde{X} \times_W \mathbb{F}_q = X$. While $\tilde{X}$ is not determined by $X$, 
we can “complete along the special fibre” to get something more closely bound 
to $X$. 

Write $\tilde{X} = \operatorname{Spec} \tilde{A}$ and let $A^\dagger$ be the weak completion of $\tilde{A}$, which is the 
smallest subring containing $\tilde{A}$ of the $p$-adic completion of $\tilde{A}$ which is $p$-adically 
saturated (i.e., if $px \in A^\dagger$, then $x \in A^\dagger$) and closed under the formation of series 
of the form 

$$\sum_{i_1,\dots,i_m \geq 0} c_{i_1,\dots,i_m} x_1^{i_1} \cdots x_m^{i_m}$$

with $c_{i_1,\dots,i_m} \in W$ and $x_1,\dots,x_m \in pA^\dagger$. We call $A^\dagger$ the (integral) dagger 
algebra associated to $X$; it is determined by $X$, but only up to noncanonical 
isomorphism. 

In practice, one can describe the weak completion a bit more concretely, as 
in the following example. 
Lemma 1. The weak completion of $W[t_1,\dots,t_n]$ is the ring $W\langle t_1,\dots,t_n\rangle^\dagger$ of 
power series over $W$ which converge for $t_1,\dots,t_n$ within the disc (in the integral 
closure of $W$) around 0 of some radius greater than 1. 

In general, $A^\dagger$ is always a quotient of $W\langle t_1,\dots,t_n\rangle^\dagger$ for some $n$. 

We quickly sketch a proof of this lemma. On one hand, $W\langle t_1,\dots,t_n\rangle^\dagger$ 
(which is clearly $p$-adically saturated) is weakly complete: if $x_1,\dots,x_m \in 
pW\langle t_1,\dots,t_n\rangle^\dagger$, then for $t_1,\dots,t_n$ in some disc of radius strictly greater than 
1, the series defining $x_1,\dots,x_m$ converge to limits of norm less than 1, and so 
$\sum c_{i_1,\dots,i_m} x_1^{i_1} \cdots x_m^{i_m}$ converges on the same disc. On the other hand, any element 
of $W\langle t_1,\dots,t_n\rangle^\dagger$ has the form 

$$\sum_{j_1,\dots,j_n \geq 0} e_{j_1,\dots,j_n} t_1^{j_1} \cdots t_n^{j_n}$$

where $v_p(e_{j_1,\dots,j_n}) - a(j_1 + \cdots + j_n) > -b$ for some $a, b$ with $a > 0$ (but no 
uniform choice of $a, b$ is possible). We may as well assume that $1/a$ is an integer, 
and that $-b \geq n$ (multiplying the series by a power of $p$ if necessary, which is 
harmless since the weak completion is saturated). Then it is possible to 
write this series as 

$$\sum_{i_1,\dots,i_m \geq 0} c_{i_1,\dots,i_m} x_1^{i_1} \cdots x_m^{i_m}$$

where the $x$’s run over $p t_k^j$ for $j = 1,\dots,1/a$ and $k = 1,\dots,n$; hence it lies in 
the weak completion. 

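The module of continuous differentials $\Omega^1_{A^\dagger}$ over $A^\dagger$ can be constructed as follows: 
given a surjection $W\langle t_1,\dots,t_n\rangle^\dagger \to A^\dagger$, $\Omega^1_{A^\dagger}$ is the $A^\dagger$-module generated by 
$dt_1,\dots,dt_n$ modulo enough relations to obtain a well-defined derivation 
$d: A^\dagger \to \Omega^1_{A^\dagger}$ satisfying the rule 

$$d\left( \sum_{j_1,\dots,j_n \geq 0} e_{j_1,\dots,j_n} t_1^{j_1} \cdots t_n^{j_n} \right) = \sum_{i=1}^{n} \sum_{j_1,\dots,j_n \geq 0} j_i e_{j_1,\dots,j_n} \left( t_1^{j_1} \cdots t_i^{j_i - 1} \cdots t_n^{j_n} \right) dt_i.$$

Then the Monsky-Washnitzer cohomology (or MW-cohomology) $H^i_{MW}(X)$ of $X$ 
is the cohomology of the “de Rham complex” 

$$\cdots \to \Omega^i_{A^\dagger} \otimes_W W[\tfrac{1}{p}] \to \Omega^{i+1}_{A^\dagger} \otimes_W W[\tfrac{1}{p}] \to \cdots,$$

where $\Omega^i_{A^\dagger} = \wedge^i_{A^\dagger} \Omega^1_{A^\dagger}$. Implicit in this definition is the highly nontrivial fact 
that this cohomology is independent of all of the choices made. Moreover, if 
$X \to Y$ is a morphism of $\mathbb{F}_q$-varieties, and $A^\dagger$ and $B^\dagger$ are corresponding dagger 
algebras, then the morphism lifts to a ring map $B^\dagger \to A^\dagger$, and the induced maps 
$H^i_{MW}(Y) \to H^i_{MW}(X)$ do not depend on the choice of the ring map. The way 
this works (see [38] for the calculation) is that there are canonical homotopies 
(in the homological algebra sense) between any two such maps, on the level of 
the de Rham complexes. 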
MW-cohomology is always finite dimensional over $W[\tfrac{1}{p}]$; this follows from 
the analogous statement in rigid cohomology (see [5]). Moreover, it admits an 
analogue of the Lefschetz trace formula for Frobenius: if $X$ is purely of dimension 
$d$, and $F: A^\dagger \to A^\dagger$ is a ring map lifting the $q$-power Frobenius map, then for 
all $m > 0$, 

$$\#X(\mathbb{F}_{q^m}) = \sum_{i=0}^{d} (-1)^i \operatorname{Trace}\left( q^{dm} F^{-m}, H^i_{MW}(X) \right).$$


This makes it possible in principle, and ultimately in practice, to compute zeta 
functions by computing the action of Frobenius on MW-cohomology. 
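
As a sanity check on the trace formula, here is a worked instance for two of the simplest affine curves. It is an illustrative computation rather than part of the original exposition; the Frobenius lift $F: t \mapsto t^q$ is the assumed choice.

```latex
% X = A^1 over F_q: the dagger algebra is W<t>^\dagger, and d = 1.
% H^0_{MW} is spanned by the constant 1 (fixed by F), and H^1_{MW} = 0, since every
% overconvergent form \sum_j e_j t^j \, dt has an overconvergent antiderivative.
\#\mathbb{A}^1(\mathbb{F}_{q^m}) = \operatorname{Trace}\left(q^m F^{-m}, H^0_{MW}\right) = q^m .

% X = A^1 \setminus \{0\}: the dagger algebra is W<t, t^{-1}>^\dagger.
% H^1_{MW} is spanned by dt/t, and F(dt/t) = d(t^q)/t^q = q \, dt/t,
% so q^m F^{-m} acts on H^1_{MW} as the identity.
\#(\mathbb{A}^1 \setminus \{0\})(\mathbb{F}_{q^m})
  = \operatorname{Trace}\left(q^m F^{-m}, H^0_{MW}\right)
  - \operatorname{Trace}\left(q^m F^{-m}, H^1_{MW}\right)
  = q^m - 1 .
```

In both cases the formula returns the expected number of points, which illustrates why the factor $q^{dm} F^{-m}$, rather than $F^m$ itself, appears in the trace.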

2.3 Rigid Cohomology 

As in the algebraic de Rham setting, it is best to view Monsky-Washnitzer 
cohomology in the context of a theory not limited to affine varieties. This context 
is provided by Berthelot’s rigid cohomology; since we won’t compute directly on 
this theory, we only describe it briefly. See [4] or [19, Chapter 4] for a somewhat 
more detailed introduction.⁴ 

Suppose $X$ is an $\mathbb{F}_q$-variety which is the complement of a divisor in a smooth 
proper $Y$ which lifts to a smooth proper formal $W$-scheme. Then this lift gives 
rise to a rigid analytic space $Y^{an}$ via Raynaud’s “generic fibre” construction 
(its points are the subschemes of the lift which are integral and finite flat over 
$W$). This space comes with a specialization map to $Y$, and the inverse image 
of $X$ is denoted $]X[$ and called the tube of $X$. The rigid cohomology of $X$ is 
the (coherent) cohomology of the direct limit of the de Rham complexes over 
all “strict neighborhoods” of $]X[$ in $Y^{an}$. (Within $Y^{an}$, $]X[$ is the locus where 
certain functions take $p$-adic absolute values less than or equal to 1; to get a 
strict neighborhood, allow their absolute values to be less than or equal to $1 + \epsilon$ 
for some $\epsilon > 0$.) 

For general X, we can do the above locally (e.g., on affines) and compute 
hypercohomology via the usual spectral sequence; while the construction above 
does not sheafify, the complexes involved can be glued “up to homotopy”, which 
is enough to assemble the hypercohomology spectral sequence. 

For our purposes, the relevance of rigid cohomology is twofold. On one hand, 
it coincides with Monsky-Washnitzer cohomology for X affine. On the other 
hand, it is related to algebraic de Rham cohomology via the following theo- 
rem. (This follows, for instance, from the comparison theorems of [5] plus the 
comparison theorem between crystalline and de Rham cohomology from [3].) 

Theorem 1. Let $Y$ be a smooth proper $W$-scheme, let $Z \subset Y$ be a relative 
normal crossings divisor, and set $X = Y \setminus Z$. Then there is a canonical isomorphism 

$$H^i_{dR}(X \times_W (\operatorname{Frac} W)) \cong H^i_{rig}(X \times_W \mathbb{F}_q).$$

⁴ We confess that a presentation at the level of detail we would like does not appear 
in print anywhere. Alas, these proceedings are not the appropriate venue to correct 
this! 
In particular, if X is affine in this situation, its Monsky-Washnitzer cohomology 
is finite dimensional and all of the relations are explained by relations among 
algebraic forms, i.e., relations of finite length. This makes it much easier to 
construct “reduction algorithms”, such as those described in the next section. 

One also has a comparison theorem between rigid cohomology and crystalline 
cohomology, a p-adic cohomology built in a more “Grothendieckian” manner. 
While crystalline cohomology only behaves well for smooth proper varieties, it 
has the virtue of being an integral theory. Thus the comparison to rigid coho- 
mology equips the latter with a canonical integral structure. By repeating this 
argument in the context of log-geometry, one also obtains a canonical integral 
structure in the setting of Theorem 1; this is sometimes useful in computations. 

3 Hyperelliptic Curves in Odd Characteristic 

The first⁵ class of varieties where p-adic cohomology was demonstrated to be 
useful for numerical computations is the class of hyperelliptic curves in odd 
characteristic, which we considered in [24]. In this section, we summarize the 
key features of the computation, which should serve as a prototype for more 
general considerations. 

3.1 Overview 

An overview of the computation may prove helpful to start with. The idea is 
to compute the action of Frobenius on the MW-cohomology of an affine hyper- 
elliptic curve, and use the Lefschetz trace formula to recover the zeta function. 
Of course we cannot compute exactly with infinite series of p-adic numbers, so 
the computation will be truncated in both the series and p-adic directions, but 
we arrange to keep enough precision at the end to uniquely determine the zeta 
function. 

Besides worrying about precision, carrying out this program requires making 
two features of the Monsky-Washnitzer construction algorithmic. 

— We must be able to compute a Frobenius lift on a dagger algebra. 

— We must be able to identify differentials forming a basis of the relevant 
cohomology space, and to “reduce” an arbitrary differential to a linear com- 
bination of the basis differentials plus an exact differential. 

3.2 The Dagger Algebra and the Frobenius Lift 

Suppose that $p \neq 2$, and let $\bar{X}$ be the hyperelliptic curve of genus $g$ given by the 
affine equation 

$$y^2 = P(x)$$

with $P(x)$ monic of degree $2g + 1$ over $\mathbb{F}_q$ with no repeated roots; in particular, $\bar{X}$ 
has a rational Weierstrass point⁶ at infinity. Let $X$ be the affine curve obtained 
from $\bar{X}$ by removing all of the Weierstrass points, i.e., the point at infinity and 
the zeroes of $y$. 

⁵ Although this seems to be the first overt use of MW-cohomology for numerically 
computing zeta functions in the literature, it is prefigured by work of Kato and 
Lubkin [23]. Also, similar computations appear in more theoretical settings, such as 
Gross’s work on companion forms [20]. 

Choose a lift $\tilde{P}(x)$ of $P(x)$ to a monic polynomial of degree $2g + 1$ over $W$. 
Then the dagger algebra corresponding to $X$ is given by 

$$W\langle x, y, z\rangle^\dagger / (y^2 - \tilde{P}(x),\; yz - 1),$$

whose elements can be expressed as $\sum_{i \in \mathbb{Z}} A_i(x) y^i$ with $A_i(x) \in W[x]$, $\deg(A_i) \leq 2g$, 
and $v_p(A_i) - c|i| > d$ for some constants $c, d$ with $c > 0$. 

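The dagger algebra admits a $p$-power Frobenius lift $\sigma$ given by 

$$x \mapsto x^p, \qquad y \mapsto y^p \left( 1 + \frac{\tilde{P}^\sigma(x^p) - \tilde{P}(x)^p}{\tilde{P}(x)^p} \right)^{1/2}$$

(with $\tilde{P}^\sigma$ obtained from $\tilde{P}$ by applying the Witt vector Frobenius to its 
coefficients), which can be computed by a Newton iteration. Here is where the removal of 
the Weierstrass points comes in handy; the simple definition of $\sigma$ above clearly 
requires inverting $\tilde{P}(x)$, or equivalently $y$. It is possible to compute a Frobenius 
lift on the dagger algebra of the full affine curve (namely $W\langle x, y\rangle^\dagger / (y^2 - \tilde{P}(x))$), 
but this requires solving for the images of both $x$ and $y$, using a cumbersome 
two-variable Newton iteration. 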
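
The Newton iteration in question can be illustrated in the simplest scalar setting: computing an inverse square root of a $p$-adic unit congruent to 1 modulo $p$, to precision $p^N$. This is a sketch under simplifying assumptions (integers modulo $p^N$ instead of series in the dagger algebra, and no attempt to work at reduced precision in early steps); the function name is hypothetical.

```python
def inv_sqrt_padic(u, p, N):
    """Return s with u * s^2 = 1 (mod p^N) and s = 1 (mod p).

    Requires p odd and u = 1 (mod p).
    """
    if p == 2 or u % p != 1:
        raise ValueError("need p odd and u congruent to 1 modulo p")
    mod = p**N
    inv2 = pow(2, -1, mod)
    s, prec = 1, 1  # s is correct modulo p^prec
    while prec < N:
        # Newton step for f(s) = 1/s^2 - u: the precision doubles each time.
        s = s * (3 - u * s * s) * inv2 % mod
        prec = min(2 * prec, N)
    return s


p, N = 7, 10
u = 1 + 7 * 12345
s = inv_sqrt_padic(u, p, N)
root = u * s % p**N  # a square root of u modulo p^N
```

Iterating on the inverse square root avoids any division other than by 2; the square root itself is then recovered with one multiplication by u.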




3.3 Reduction in Cohomology 

The hyperelliptic curve defined by $y^2 = \tilde{P}(x)$, minus its Weierstrass points, 
forms a lift $\tilde{X}$ of $X$ of the type described in Theorem 1, so its algebraic de Rham 
cohomology coincides with the MW-cohomology $H^1_{MW}(X)$. That is, the latter is 
generated by 

$$\frac{x^i\,dx}{y} \quad (i = 0,\dots,2g-1), \qquad \frac{x^i\,dx}{y^2} \quad (i = 0,\dots,2g)$$

and it is enough to consider “algebraic” relations. Moreover, the cohomology 
splits into plus and minus eigenspaces for the hyperelliptic involution $y \mapsto -y$; 
the former is essentially the cohomology of $\mathbb{P}^1$ minus the images of the 
Weierstrass points (since one can eliminate $y$ entirely), so to compute the zeta function 
of $\bar{X}$ we need only worry about the latter. In other words, we need only consider 
forms $f(x)\,dx/y^s$ with $s$ odd. 

The key reduction formula is the following: if $A(x) = \tilde{P}(x)B(x) + \tilde{P}'(x)C(x)$, 
then 

$$\frac{A(x)\,dx}{y^s} \equiv \left( B(x) + \frac{2C'(x)}{s-2} \right) \frac{dx}{y^{s-2}}$$

as elements of $H^1_{MW}(X)$. This is an easy consequence of the evident relation 

$$d\left( \frac{C(x)}{y^{s-2}} \right) \equiv 0$$

in cohomology. 

⁶ The case of no rational Weierstrass point is not considered in [24]; it has been worked 
out by Michael Harrison, and has the same asymptotics. 
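
For completeness, the derivation behind the reduction formula can be spelled out as follows. This is a routine verification added for the reader, writing $\tilde{P}$ for the lifted polynomial and using only the curve equation $y^2 = \tilde{P}(x)$.

```latex
% Differentiating y^2 = \tilde{P}(x) gives 2y\,dy = \tilde{P}'(x)\,dx, hence
d\left(\frac{C(x)}{y^{s-2}}\right)
  = \frac{C'(x)\,dx}{y^{s-2}} - (s-2)\,\frac{C(x)\,dy}{y^{s-1}}
  = \frac{C'(x)\,dx}{y^{s-2}} - \frac{s-2}{2}\cdot\frac{\tilde{P}'(x)\,C(x)\,dx}{y^{s}} .

% Exact forms vanish in cohomology, so
\frac{\tilde{P}'(x)\,C(x)\,dx}{y^{s}} \equiv \frac{2C'(x)}{s-2}\cdot\frac{dx}{y^{s-2}},
\qquad
\frac{\tilde{P}(x)\,B(x)\,dx}{y^{s}} = \frac{B(x)\,dx}{y^{s-2}} .

% Adding the two relations for A = \tilde{P} B + \tilde{P}' C yields
\frac{A(x)\,dx}{y^{s}} \equiv \left(B(x) + \frac{2C'(x)}{s-2}\right)\frac{dx}{y^{s-2}} .
```

The decomposition $A = \tilde{P}B + \tilde{P}'C$ exists because $\tilde{P}$ has no repeated roots, so $\tilde{P}$ and $\tilde{P}'$ generate the unit ideal after inverting $p$; the division by $s - 2$ is one place where $p$-adic precision can be lost.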

We use this reduction formula as follows. Compute the image under Frobenius 
of $x^i\,dx/y$ (truncating large powers of $y$ or $y^{-1}$, and $p$-adically approximating 
coefficients). If the result is 

$$\sum_{j=-M}^{N} \frac{A_j(x)\,dx}{y^{2j+1}},$$

use the reduction formula to eliminate the $j = N$ term in cohomology, then the 
$j = N - 1$ term, and so on, until no terms with $j > 0$ remain. Do likewise with 
the $j = -M$ term, the $j = -M + 1$ term, and so on (using a similar reduction 
formula, which we omit; note that there are relatively few terms on that side 
anyway). Repeat for $i = 0,\dots,2g - 1$, and construct the “matrix of the $p$-power 
Frobenius” $\Phi$. Of course the $p$-power Frobenius is not linear, but the matrix of 
the $q$-power Frobenius is easily obtained as $\Phi^{\sigma^{n-1}} \cdots \Phi^{\sigma} \Phi$, where $\sigma$ here is the 
Witt vector Frobenius and $q = p^n$. 



3.4 Precision 

We complete the calculation described above with a $p$-adic approximation of a 
matrix whose characteristic polynomial would exactly compute the numerator 
$Q(t)$ of the zeta function. However, we can bound the coefficients of that 
numerator using the Weil conjectures: if $Q(t) = 1 + a_1 t + \cdots + a_{2g} t^{2g}$, then for 
$1 \leq i \leq g$, $a_{g+i} = q^i a_{g-i}$ and 

$$|a_i| \leq \binom{2g}{i} q^{i/2}.$$

In particular, computing $a_i$ modulo a power of $p$ greater than twice the right 
side determines it uniquely. 
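
The recovery of the integer coefficients from their $p$-adic approximations can be sketched as follows. The code assumes the residues of $a_0, \dots, a_g$ modulo $p^N$ are already known (for example, from the characteristic polynomial of the approximate Frobenius matrix); the function names are hypothetical, and the bound is rounded up to an integer for simplicity.

```python
from math import comb, isqrt


def weil_bound(g, q, i):
    """An integer B with B >= binom(2g, i) * q^(i/2)."""
    c = comb(2 * g, i)
    return isqrt(c * c * q**i) + 1


def recover_numerator(residues, p, N, g, q):
    """Given residues[i] = a_i mod p^N for i = 0..g, return [a_0, ..., a_{2g}]."""
    mod = p**N
    coeffs = []
    for i in range(g + 1):
        if mod <= 2 * weil_bound(g, q, i):
            raise ValueError("precision p^N too small to determine a_%d" % i)
        r = residues[i] % mod
        # Choose the representative of smallest absolute value.
        coeffs.append(r - mod if r > mod // 2 else r)
    for i in range(1, g + 1):
        # Functional equation: a_{g+i} = q^i * a_{g-i}.
        coeffs.append(q**i * coeffs[g - i])
    return coeffs
```

For example, `recover_numerator([1, 22], 5, 2, 1, 5)` returns `[1, -3, 5]`: the residue 22 modulo 25 is lifted to -3, and the top coefficient comes from the functional equation.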

As noted at the end of the previous section, it is critical to know how much 
p-adic precision is lost in various steps of the calculation, in order to know how 
much initial precision is needed for the final calculation to uniquely determine 
the zeta function. Rather than repeat the whole analysis here, we simply point 
out the key estimate [24, Lemmas 2 and 3] and indicate where it comes from. 

Lemma 2. For $A_k(x)$ a polynomial over $W$ of degree at most $2g$ and $k \geq 0$ 
(resp. $k < 0$), the reduction of $A_k(x) y^{2k+1}\,dx$ (i.e., the linear combination of 
$x^i\,dx/y$ over $i = 0,\dots,2g - 1$ cohomologous to it) becomes integral upon 
multiplication by $p^d$ for $d \geq \log_p((2g + 1)(k + 1) - 2)$ (resp. $d \geq \log_p(-2k - 1)$). 

This is seen by considering the polar part of $A_k(x) y^{2k+1}\,dx$ around the point 
at infinity if $k \geq 0$, or the other Weierstrass points if $k < 0$. Multiplying by 
$p^d$ ensures that the antiderivatives of the polar parts have integral coefficients, 
which forces the reductions to do likewise. 

It is also worth pointing out that one can manage precision rather simply 
by working in p-adic fixed point arithmetic. That is, approximate all numbers 
modulo some fixed power of p, regardless of their valuation (in contrast to p- 
adic floating point, where each number is approximated by a power of p times 
a mantissa of fixed precision) . When a calculation produces undetermined high- 
order digits, fill them in arbitrarily once, but do not change them later. (That 
is, if x is computed with some invented high-order digits, each invocation of x 
later must use the same invented digits.) The analysis in [24], using the above 
lemma, shows that most of these invented digits cancel themselves out later in 
the calculation, and the precision loss in the reduction process ends up being 
negligible compared to the number of digits being retained. 
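
A toy model of $p$-adic fixed point arithmetic may help fix ideas. In the sketch below (an illustration with hypothetical names, not the bookkeeping used in [24]), every number is an integer modulo $p^N$; dividing by $p$ leaves the top digit undetermined, and the code fills it in with 0, which is deterministic and therefore the same at every later use.

```python
class FixedPoint:
    """Arithmetic on p-adic integers truncated modulo a fixed power p^N."""

    def __init__(self, p, N):
        self.p = p
        self.mod = p**N

    def add(self, a, b):
        return (a + b) % self.mod

    def mul(self, a, b):
        return (a * b) % self.mod

    def div_p(self, a):
        """Divide by p; the coefficient of p^(N-1) is unknown and is set to 0."""
        a %= self.mod
        if a % self.p != 0:
            raise ValueError("value is not divisible by p")
        return a // self.p


fp = FixedPoint(3, 4)          # work modulo 3^4 = 81
quotient = fp.div_p(54)        # true value is 18 + 27*k for an unknown digit k
restored = fp.mul(quotient, 3) # the invented digit cancels after multiplying by p
```

Whatever digit is invented, multiplying back by $p$ gives the same residue modulo $p^N$; this is a small instance of the cancellation phenomenon described above.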



3.5 Integrality 

In practice, it makes life slightly⁷ easier if one uses a basis in which the matrix of 
Frobenius is guaranteed to have $p$-adically integral coefficients. The existence of 
such a basis is predicted by the comparison with crystalline cohomology, but an 
explicit good basis can be constructed “by hand” by careful use of Lemma 2. For 
instance, the given basis $x^i\,dx/y$ ($i = 0,\dots,2g - 1$) is only good when $p > 2g + 1$; 
on the other hand, the basis $x^i\,dx/y^3$ ($i = 0,\dots,2g - 1$) is good for all $p$ and $g$. 



3.6 Asymptotics 

As for time and memory requirements, the runtime analysis in [24] together 
with [15] show that the algorithm requires time $\tilde{O}(p n^3 g^4)$ and space $\tilde{O}(p n^3 g^3)$, 
where again $g$ is the genus of the curve and $n = \log_p q$. (Here the “soft O” 
notation ignores logarithmic factors, arising in part from asymptotically fast 
integer arithmetic.) 



4 Variations 

In this section, we summarize some of the work on computing MW-cohomology 
for other classes of curves. We also mention some experimental results obtained 
from implementations of these algorithms. 

⁷ But only slightly: the fact that there is some basis on which Frobenius acts by an 
integer matrix means that the denominators in the product $\Phi^{\sigma^{n-1}} \cdots \Phi^{\sigma} \Phi$ can be 
bounded independently of $n$. 
4.1 Hyperelliptic Curves in Characteristic 2 

The method described in the previous section does not apply in characteristic 2, 
because the equation $y^2 = P(x)$ is nonreduced and does not give rise to 
hyperelliptic curves. Instead, one must view the hyperelliptic curve as an Artin-Schreier 
cover of $\mathbb{P}^1$ and handle it accordingly; in particular, we must lift somewhat 
carefully. We outline how to do this following Denef and Vercauteren [9], [10]. 
(Analogous computations based more on Dwork’s work have been described by 
Lauder and Wan [31], [32], but they seem less usable in practice.) 

Let $\bar{X}$ be a hyperelliptic curve of genus $g$ over $\mathbb{F}_q$, with $q = 2^n$; it is defined 
by some plane equation of the form 

$$y^2 + h(x)y = f(x),$$

where $f$ is monic of degree $2g + 1$ and $\deg(h) \leq g$. Let $H$ be the monic squarefree 
polynomial over $\mathbb{F}_q$ with the same roots as $h$. By an appropriate substitution of 
the form $y \mapsto y + a(x)$, we can ensure that $f$ vanishes at each root of $H$. 

Let $X$ be the affine curve obtained from $\bar{X}$ by removing the point at infinity 
and the zero locus of $H$. Choose lifts $\tilde{H}, \tilde{h}, \tilde{f}$ of $H, h, f$ to polynomials over $W$ of 
the same degree, such that each root of $\tilde{h}$ is also a root of $\tilde{H}$, and each root of 
$\tilde{f}$ whose reduction mod $p$ is a root of $H$ is also a root of $\tilde{H}$. The dagger algebra 
corresponding to $X$ is now given by 

$$W\langle x, y, z\rangle^\dagger / (y^2 + \tilde{h}(x)y - \tilde{f}(x),\; \tilde{H}(x)z - 1),$$

and each element can be written uniquely as 

$$\sum_{i \in \mathbb{Z}} A_i(x) H_1(x)^i + \sum_{i \in \mathbb{Z}} B_i(x)\, y\, H_1(x)^i$$

with $H_1(x) = x$ if $\tilde{H}$ is constant and $H_1(x) = \tilde{H}(x)$ otherwise, $\deg(A_i) < \deg(H_1)$ 
and $\deg(B_i) < \deg(H_1)$ for all $i$, and $v_p(A_i) - c|i| > d$ and $v_p(B_i) - c|i| > d$ 
for some $c, d$ with $c > 0$. The dagger algebra admits a Frobenius lift sending $x$ 
to $x^2$, but this requires some checking, especially to get an explicit convergence 
bound; see [51, Lemma 4.4.1] for the analysis. 

By Theorem 1, the MW-cohomology of $X$ coincides with the cohomology 
of the hyperelliptic curve $y^2 + \tilde{h}(x)y - \tilde{f}(x) = 0$ minus the point at infinity and 
the zero locus of $\tilde{H}$. Again, it decomposes into plus and minus eigenspaces for 
the hyperelliptic involution $y \mapsto -y - \tilde{h}(x)$, and only the minus eigenspace 
contributes to the zeta function of $\bar{X}$. The minus eigenspace is spanned by $x^i y\,dx$ 
for $i = 0,\dots,2g - 1$, there are again simple reduction formulae for expressing 
elements of cohomology in terms of this basis, and one can again bound the 
precision loss in the reduction; we omit details. 

In this case, the time complexity of the algorithm is $\tilde{O}(n^3 g^5)$ and the space 
complexity is $\tilde{O}(n^3 g^4)$. If one restricts to ordinary hyperelliptic curves (i.e., those 
where $H$ has degree $g$), the time and space complexities drop to $\tilde{O}(n^3 g^4)$ and 
$\tilde{O}(n^3 g^3)$, respectively, as in the odd characteristic case. It may be possible to 
optimize better for the opposite extreme case, where the curve has $p$-rank close 
to zero, but we have not tried to do this. 
4.2 Other Curves 

Several variations on the theme developed above have been pursued. For 
instance, Gaudry and Gürel [15] have considered superelliptic curves, i.e., those of 
the form 

$$y^m = P(x)$$

where $m$ is not divisible by $p$. More generally still, Denef and Vercauteren 
consider the class of $C_{a,b}$-curves, as defined by Miura [35]. For $a, b$ coprime integers, 
a $C_{a,b}$-curve is one of the form 

$$y^a + \sum_{i=1}^{a-1} f_i(x) y^i + f_0(x) = 0,$$

where $\deg f_0 = b$ and $a \deg f_i + bi < ab$ for $i = 1,\dots,a - 1$, and the above 
equation has no singularities in the affine plane. 

These examples fit into an even broader class of potentially tractable curves, 
which we describe following Miura [35]. Recall that for a curve $C$ and a point $P$, 
the Weierstrass monoid is defined to be the set of nonnegative integers which 
occur as the pole order at $P$ of some meromorphic function with no poles away 
from $P$. Let $a_1 < \cdots < a_n$ be a minimal set of generators of the Weierstrass 
monoid, and put $d_i = \gcd(a_1,\dots,a_i)$. Then the monoid is said to be Gorenstein 
(in the terminology of [39]) if for $i = 2,\dots,n$, 

$$\frac{a_i}{d_i} \in \frac{a_1}{d_{i-1}} \mathbb{Z}_{\geq 0} + \cdots + \frac{a_{i-1}}{d_{i-1}} \mathbb{Z}_{\geq 0}.$$


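If the Weierstrass monoid of $C$ is Gorenstein for some $P$, the curve $C$ is said to 
be telescopic; its genus is then equal to 

$$g = \frac{1}{2} \left( 1 - a_1 + \sum_{i=2}^{n} \left( \frac{d_{i-1}}{d_i} - 1 \right) a_i \right).$$
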
The cohomology of telescopic curves is easy to describe, so it seems likely 
that one can compute Monsky-Washnitzer cohomology on them. The case $n = 2$ 
is the $C_{a,b}$ case; for larger $n$, this has been worked out by Suzuki [47] in what he 
calls the “strongly telescopic” case. This case is where for each $i$, the map from 
$C$ to its image under the projective embedding defined by $\mathcal{O}(a_i P)$ is a cyclic 
cover (e.g., if $C$ is superelliptic). 
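
The combinatorial condition on the monoid is easy to test by machine. The sketch below (function names are hypothetical) checks the condition $a_i/d_i \in \sum_{j<i} (a_j/d_{i-1})\mathbb{Z}_{\geq 0}$ for the generators in the given order, evaluates the standard genus formula for telescopic monoids, and compares it with a brute-force count of the gaps of the monoid. The cutoff min(a) * max(a) used for the gap count is an assumed safe bound for generators with gcd 1.

```python
from functools import reduce
from math import gcd


def monoid_table(gens, limit):
    """table[t] is True iff t is a nonnegative integer combination of gens."""
    table = [True] + [False] * limit
    for t in range(1, limit + 1):
        table[t] = any(t >= a and table[t - a] for a in gens)
    return table


def prefix_gcds(a):
    """[d_1, ..., d_n] with d_i = gcd(a_1, ..., a_i)."""
    return [reduce(gcd, a[: i + 1]) for i in range(len(a))]


def is_telescopic(a):
    """Check the telescopic condition for the generators in the given order."""
    d = prefix_gcds(a)
    if d[-1] != 1:
        return False
    for i in range(1, len(a)):
        target = a[i] // d[i]
        gens = [x // d[i - 1] for x in a[:i]]
        if not monoid_table(gens, target)[target]:
            return False
    return True


def telescopic_genus(a):
    """Genus formula (1 - a_1 + sum_{i>=2} (d_{i-1}/d_i - 1) * a_i) / 2."""
    d = prefix_gcds(a)
    total = 1 - a[0] + sum((d[i - 1] // d[i] - 1) * a[i] for i in range(1, len(a)))
    return total // 2


def genus_by_gaps(a):
    """Count the gaps of the monoid generated by a (assumes gcd 1)."""
    limit = min(a) * max(a)
    return monoid_table(a, limit).count(False)
```

For the $C_{3,4}$ monoid generated by 3 and 4, both `telescopic_genus([3, 4])` and `genus_by_gaps([3, 4])` give 3, matching $(a-1)(b-1)/2$; the generators 4, 6, 7 give a telescopic monoid of genus 5, while 3, 4, 5 fail the condition.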

We expect that these can be merged to give an algorithm treating the 
general case of telescopic curves. One practical complication (already appearing in 
the $C_{a,b}$ case) is that using a Frobenius lift of the form $x \mapsto x^p$ necessitates 
inverting an unpleasantly large polynomial in $y$; it seems better instead to 
iteratively compute the action on both $x$ and $y$ of a Frobenius lift without inverting 
anything. 
4.3 Implementation 

The algorithms described above have proved quite practicable; here we mention 
some implementations and report on their performance. Note that time and 
space usage figures are only meant to illustrate feasibility; they are in no way 
standardized with respect to processor speed, platform, etc. Also, we believe all 
curves and fields described below are “random” , without special properties that 
make them easier to handle. 

The first practical test of the original algorithm from [24] seems to have been 
that of Gaudry and Giirel [15], who computed the zeta function of a genus 3 
hyperelliptic curve over F 337 in 30 hours (apparently not optimized). They also 
tested their superelliptic variant, treating a genus 3 curve over F 253 in 22 hours. 

Gaudry and Gürel [16] have also tested the dependence on p in the
hyperelliptic case. They computed the zeta function of a genus 3 hyperelliptic curve
over F_251 in 42 seconds using 25 MB of memory, and over F_10007 in 1.61 hours
using 1.4 GB.

In the genus direction, Vercauteren [51, Sections 4.4-4.5] computed the zeta
function of a genus 60 hyperelliptic curve over F_2 in 7.64 minutes, and of a genus
350 curve over F_2 in 3.5 days. We are not aware of any high-genus tests in odd
characteristic; in particular, we do not know whether the lower exponent in the 
time complexity will really be reflected in practice. 

Vercauteren [51, Section 5.5] has also implemented the C_{a,b}-algorithm in
characteristic 2. He has computed the zeta function of a C_{3,4} curve over F_{2^288} in
8.4 hours and of a C_{3,5} curve over F_{2^288} in 12.45 hours.

Finally, we mention an implementation “coming to a computer near you”: 
Michael Harrison has implemented the computation of zeta functions of hyperel- 
liptic curves in odd characteristic (with or without a rational Weierstrass point) 
in a new release of Magma. At the time of this writing, we have not seen any 
performance results. 

5 Beyond Hyperelliptic Curves 

We conclude by describing some of the rich possibilities for further productive 
computations of p-adic cohomology, especially in higher dimensions. A more 
detailed assessment, plus some explicit formulae that may prove helpful, appear 
in the thesis of Gerkmann [19] (recently completed under G. Frey). 

5.1 Simple Covers 

The main reason the cohomology of hyperelliptic curves in odd characteristic 
is easily computable is that they are “simple” (Galois, cyclic, tamely ramified) 
covers of a “simple” variety (which admits a simple Frobenius lift). As a first 
step into higher dimensions, one can consider similar examples; for instance, a 
setting we are currently considering with de Jong (with an eye toward gathering
data on the Tate conjecture on algebraic cycles) is the class of double covers of
P^2 of fixed small degree.

One might also consider some simple wildly ramified covers, like Artin-Schreier
covers, which can be treated following Denef-Vercauteren. (These are
also good candidates for Lauder’s deformation method; see below.)

5.2 Toric Complete Intersections 

Another promising class of varieties to study are smooth complete intersections 
in projective space or other toric varieties. These are promising because their 
algebraic de Rham cohomology can be computed by a simple recipe; see [19, 
Chapter 5]. 

Moreover, some of these varieties are of current interest thanks to connections 
to physics. For instance, Candelas et al. [6] have studied the zeta functions of
some Calabi-Yau threefolds occurring as toric complete intersections, motivated 
by considerations of mirror symmetry. 

5.3 Deformation 

We mention also a promising new technique proposed by Lauder. (A related 
strategy has been proposed by Nobuo Tsuzuki [48] for computing Kloosterman 
sums.) Lauder’s strategy is to compute the zeta function of a single variety not in 
isolation, but by placing it into a family and studying, after Dwork, the variation 
in Frobenius along the family as the solution of a certain differential equation.*

A very loose description of the method is as follows. Given an initial X, say
smooth and proper, find a family f : Y → B over a simple one-dimensional
base (like projective space) which is smooth away from finitely many points,
includes X as one fibre, and has another fibre which is “simple”. We also ask
for simplicity that the whole situation lifts to characteristic zero. For instance,
if X is a smooth hypersurface, Y might be a family which linearly interpolates
between the defining equation of X and that of a diagonal hypersurface.

One can now compute (on the algebraic lift to characteristic zero) the
Gauss-Manin connection of the family; this will give in particular a module with
connection over a dagger algebra corresponding to the part of B where f is smooth.
One then shows that there is a Frobenius structure on this differential equation 
that computes the characteristic polynomial of Frobenius on each smooth fibre. 
That means the Frobenius structure itself satisfies a differential equation, which 
one solves iteratively using an initial condition provided by the simple fibre. (In 
the hypersurface example, one can write down by hand the Frobenius action on 
the cohomology of a diagonal hypersurface.) 

Lauder describes explicitly how to carry out the above recipe for Artin-Schreier
covers of projective space [28] and smooth projective hypersurfaces [29].
The technique has not yet been implemented on a computer, so it remains to
be seen how it performs in practice. It is expected to prove most advantageous
for higher dimensional varieties, as one avoids the need to compute in
multidimensional polynomial rings. In particular, Lauder shows that in his examples,
the dependence of this technique on d = dim X is exponential in d, and not d^2.
(This is essentially best possible, as the dimensions of the cohomology spaces in
question typically grow exponentially in d.)

* Lest this strategy seem strangely indirect, note the resemblance to Deligne’s strategy
[8] for proving the Riemann hypothesis component of the Weil conjectures!

5.4 Additional Questions 

We conclude by throwing out some not very well-posed further questions and
suggestions.

— Can one collect data about a class of “large” curves (e.g., hyperelliptic
curves of high genus) over a fixed field, and predict (or even prove) some 
behavioral properties of the Frobenius eigenvalues of a typical such curve, in 
the spirit of Katz-Sarnak? 

— With the help of cohomology computations, can one find nontrivial instances 
of cycles on varieties whose existence is predicted by the Tate conjecture? 
As noted above, we are looking into this with Johan de Jong. 

— The cohomology of Deligne-Lusztig varieties furnishes representations of finite
groups of Lie type. Does the p-adic cohomology in particular shed any light 
on the modular representation theory of these varieties (i.e., in characteristic 
equal to that of the underlying field)? 

— There is a close link between p-adic Galois representations and the p-adic
differential equations arising here; this is most explicit in the work of Berger
[2]. Can one extend this analogy to make explicit computations on p-adic
Galois representations, e.g., associated to varieties over Q_p, or modular forms?
The work of Coleman and Iovita [7] may provide a basis for this.

Abstract. This paper gives a survey of some ways to improve the
efficiency of discrete log-based cryptography by using the restriction of
scalars and the geometry and arithmetic of algebraic tori and abelian
varieties.



1 Introduction 

This paper is a survey, intended to be readable by both mathematicians and 
cryptographers, of some of the results in [24,25,26], along with a new result in 
§3.6. It can be viewed as a sequel to the Brouwer-Pellikaan-Verheul paper “Doing 
more with fewer bits” [8]. 

The overall objective is to provide greater efficiency for the same security.
The idea is to shorten transmissions by a factor of n/φ(n) by going from a finite
field F_q up to the larger field F_{q^n}, and using “primitive subgroups”. Here, φ(n)
is the Euler φ-function. Note that n/φ(n) is unbounded, although it grows very
slowly: it goes to infinity along suitable n, such as the products of the first k
primes as k grows.

The first goal is to obtain the same security as the classical Diffie-Hellman
and ElGamal cryptosystems, while sending shorter transmissions. More precisely,
the goal is to do discrete log-based cryptography, relying on the security of F_{q^n}^×,
while transmitting only φ(n) elements of F_q, instead of n elements of F_q (i.e.,
one element of F_{q^n}). We use algebraic tori. The next goal is to improve
pairing-based cryptosystems. Here, we use elliptic curves E and primitive subgroups of
E(F_{q^n}).

As pointed out by Dan Bernstein, the techniques discussed here can be viewed 
as “compression” techniques, adding more flexibility for the user, who might 
choose to send compressed information when the network is the bottleneck and
uncompressed information when computational power is the bottleneck. 

In §2 we discuss some background and past results on compressing the trans- 
missions in discrete log-based cryptography for the multiplicative group. In §3 
we give an exposition of torus-based cryptography; we give a new implementa- 
tion of CEILIDH in §3.6. In §4 we show how to compress the transmissions in 
pairing-based cryptosystems. In §5 we discuss some of the underlying mathe- 
matics, including an elementary introduction to the Weil restriction of scalars; 
we define “primitive subgroup” in §5.5. In §6 we discuss the mathematics un- 
derlying torus-based cryptography, and interpret some earlier systems in terms 
of quotients of algebraic tori. 

For technical details, see the original papers. See also [11] (especially §3.2) 
for the use of primitive subgroups in cryptography. 



2 Some Background 

We first recall the classical Diffie-Hellman key agreement scheme [10,21]. 



2.1 Classical Diffie-Hellman

In classical Diffie-Hellman key agreement, a large finite field F_q is public (q ≈
2^1024), as is an element g ∈ F_q^× of large (public) multiplicative order ℓ (> 2^160).
Alice chooses a private integer a, random in the interval between 1 and ℓ − 1,
and Bob similarly chooses a private integer b.

— Alice sends g^a to Bob.

— Bob sends g^b to Alice.

— They share g^{ab} = (g^a)^b = (g^b)^a.

Tautologically, the security is based on the difficulty of the Diffie-Hellman
Problem in F_q^×.
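
The exchange above can be sketched in a few lines of Python. This is only a toy illustration: the parameters q = 2039, ℓ = 1019 and g = 4 are assumptions chosen so that the arithmetic is easy to follow, and they are far too small to offer any security.

```python
import secrets

# Toy parameters (assumed for illustration only; far too small to be secure).
q = 2039     # a small prime, so F_q is the integers modulo q
ell = 1019   # a prime dividing q - 1 = 2 * 1019
g = 4        # 4 is a square different from 1, so it has order ell in F_q^*

assert g != 1 and pow(g, ell, q) == 1

a = secrets.randbelow(ell - 1) + 1   # Alice's private integer in [1, ell - 1]
b = secrets.randbelow(ell - 1) + 1   # Bob's private integer in [1, ell - 1]

A = pow(g, a, q)   # Alice sends g^a to Bob
B = pow(g, b, q)   # Bob sends g^b to Alice

shared_alice = pow(B, a, q)   # (g^b)^a
shared_bob = pow(A, b, q)     # (g^a)^b
assert shared_alice == shared_bob == pow(g, a * b, q)
```

Each transmission here is a single element of F_q; the remainder of the paper is about keeping the security of a larger field while sending fewer such elements.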

Note that when this is performed using F_{q^n} in place of F_q, then the
transmissions are elements of F_{q^n} (i.e., n elements of F_q). If one can do Diffie-Hellman
transmitting only φ(n) elements of F_q while relying on security coming from
F_{q^n}, then one would like to have n log(q) large for high security, and φ(n) log(q)
small for high bandwidth efficiency. In particular, for maximal efficiency per
unit of security (i.e., to achieve a system that is n/φ(n) times as efficient as
Diffie-Hellman), one would like n/φ(n) to be as large as possible. Thus, the most useful
n’s to consider are those in the sequence

1, 2, 2 · 3 = 6, 2 · 3 · 5 = 30, 2 · 3 · 5 · 7 = 210, . . .

(whose i-th entry is the product of the first i − 1 primes). We will discuss some
ways to do this, below.
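
As a quick numerical illustration of how slowly the efficiency factor n/φ(n) grows along this sequence, the following sketch tabulates it. The helper name `euler_phi` is our own and the trial-division approach is only meant for small n.

```python
from fractions import Fraction

def euler_phi(n: int) -> int:
    """Euler's phi function by trial division (adequate for small n)."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p   # multiply result by (1 - 1/p)
        p += 1
    if m > 1:                       # one prime factor larger than sqrt(n) may remain
        result -= result // m
    return result

# The sequence 1, 2, 6, 30, 210 from the text, with phi(n) and n/phi(n).
for n in (1, 2, 6, 30, 210):
    ratio = Fraction(n, euler_phi(n))
    print(n, euler_phi(n), ratio, float(ratio))
```

The ratios are 1, 2, 3, 15/4 and 35/8, so moving from n = 6 to n = 30 improves the factor only from 3 to 3.75.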

2.2 A Brief Tour of Some History

As noted in [17,8], one can achieve greater efficiency per unit of security by
choosing g in the subgroup of F_{q^n}^× of order Φ_n(q), where Φ_n(x) is the n-th
cyclotomic polynomial. (The polynomial Φ_n(x) has integer coefficients, and its
(complex) roots are the primitive n-th roots of unity.)

Diffie-Hellman key agreement is based on the full multiplicative group F_q^×,
which is a group of order q − 1 = Φ_1(q).

In [22,31,32,28,29,25], analogues of the classical Diffie-Hellman key agreement
scheme are introduced that rely on the security of F_{p^2} while transmitting only
one element of F_p. One now takes the element g to lie in the subgroup of F_{p^2}^× of
order p + 1 (= Φ_2(p)). Since n = 2, we have n/φ(n) = 2, and achieve twice the
efficiency of Diffie-Hellman for comparable security. The papers [22,31,32,28,29]
use Lucas sequences [20], to give what are known as Lucas-based cryptosystems.
See [4] for a critique of [28,29]. In [25] (see §3.4 below) we introduced the
T_2-cryptosystem, which is a torus-based system. It is related to the Lucas-based
cryptosystems (see §6.5 below), and has some advantages over them.

The Gong-Harn system [13] uses linear feedback shift register sequences. In
this case n = 3, so n/φ(n) = 1.5. This cryptosystem relies on the security of F_{p^3}
while transmitting only two elements of F_p, using the subgroup of F_{p^3}^× of order
p^2 + p + 1 (= Φ_3(p)).

The case where n = 6 (so n/φ(n) = 3) is considered in [8], [19] (the XTR
system), and [25] (the CEILIDH system). These systems give three times the
efficiency of Diffie-Hellman, for the same security. They rely on the security of
F_{p^6} while transmitting only two elements of F_p, using the subgroup of F_{p^6}^× of
order p^2 − p + 1 (= Φ_6(p)).

Arjen Lenstra [18] has asked whether one can use n = 30 to do better than
XTR. Note that φ(30) = 8 and

Φ_30(x) = x^8 + x^7 − x^5 − x^4 − x^3 + x + 1.

Building on a conjecture in [8], conjectures for arbitrary n were given in [6]. 
Those conjectures were disproved in [6,25,26], and it was proposed in [25,26] 
that a conjecture of Voskresenskii should replace those conjectures. 

2.3 Classical ElGamal Encryption 

As before, the public information is a large finite field F_q and an element g ∈ F_q^×
of order ℓ, along with q and ℓ.

Alice’s private key: an integer a, random in the interval [1, ℓ − 1]

Alice’s public key: P_A = g^a ∈ F_q

— Bob represents the message M in ⟨g⟩ and chooses a random integer r between
1 and ℓ − 1. Bob sends Alice the ciphertext (c, d) where c = g^r and d = M · P_A^r.

— To decrypt a ciphertext (c, d), Alice computes

d · c^{−a} = M · (g^a)^r · (g^r)^{−a} = M.
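
A minimal sketch of this encryption scheme follows, again with assumed toy parameters (q = 2039, ℓ = 1019, g = 4) that are far too small for real use. The function names are ours, and the message is taken to be an element of ⟨g⟩ already.

```python
import secrets

# Toy public parameters (assumed): g has prime order ell in F_q^*.
q, ell, g = 2039, 1019, 4

def keygen():
    """Return (private key a, public key P_A = g^a)."""
    a = secrets.randbelow(ell - 1) + 1
    return a, pow(g, a, q)

def encrypt(P_A, M):
    """Encrypt M in <g> as (c, d) = (g^r, M * P_A^r)."""
    r = secrets.randbelow(ell - 1) + 1
    return pow(g, r, q), (M * pow(P_A, r, q)) % q

def decrypt(a, c, d):
    """Recover M = d * c^(-a); c^(-a) = c^(ell - a) since c has order dividing ell."""
    return (d * pow(c, ell - a, q)) % q

a, P_A = keygen()
M = pow(g, 123, q)          # a message already represented in <g>
c, d = encrypt(P_A, M)
assert decrypt(a, c, d) == M
```

Unlike key agreement, both `encrypt` and `decrypt` need a genuine multiplication in F_q, which is the point taken up in Remark 1 below.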

2.4 Classical ElGamal Signatures

With public information as before, also fix a public cryptographic hash function
H : {0,1}* → Z/ℓZ (i.e., H takes bit strings to integers modulo ℓ, is easy to
compute and hard to invert, and its images look “random”).

Alice’s private key: an integer a, random in the interval [1, ℓ − 1]

Alice’s public key: P_A = g^a ∈ F_q

— To sign a message M ∈ {0,1}*, Alice chooses a random integer r between 1
and ℓ − 1 with gcd(r, ℓ) = 1. Alice’s signature on M is (c, d) where c = g^r
and d = r^{−1}(H(M) − a H(g^r)) (mod ℓ).

— Bob accepts Alice’s signature if and only if

g^{H(M)} = P_A^{H(c)} · c^d

in the field F_q.
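
The signing and verification equations can be checked with the sketch below. The toy parameters, the use of SHA-256 reduced modulo ℓ as the hash H, and the two-byte encoding of field elements are all assumptions made for illustration; `pow(r, -1, ell)` needs Python 3.8 or later.

```python
import hashlib
import secrets

# Toy public parameters (assumed): g has prime order ell in F_q^*.
q, ell, g = 2039, 1019, 4

def H(data: bytes) -> int:
    """Hash bit strings to Z/ell Z (SHA-256 reduced mod ell, for illustration)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ell

def encode(x: int) -> bytes:
    """Encode an element of F_q as a bit string so that it can be hashed."""
    return x.to_bytes(2, "big")          # q < 2^16 in this toy example

def sign(a: int, M: bytes):
    r = secrets.randbelow(ell - 1) + 1   # gcd(r, ell) = 1 holds because ell is prime
    c = pow(g, r, q)
    d = (pow(r, -1, ell) * (H(M) - a * H(encode(c)))) % ell
    return c, d

def verify(P_A: int, M: bytes, c: int, d: int) -> bool:
    # Accept iff g^H(M) = P_A^H(c) * c^d in F_q.
    return pow(g, H(M), q) == (pow(P_A, H(encode(c)), q) * pow(c, d, q)) % q

a = secrets.randbelow(ell - 1) + 1
P_A = pow(g, a, q)
c, d = sign(a, b"attack at dawn")
assert verify(P_A, b"attack at dawn", c, d)
```

The check works because c^d = g^{rd} = g^{H(M) − aH(c)}, so multiplying by P_A^{H(c)} = g^{aH(c)} gives g^{H(M)}.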

Remark 1. Note that Diffie-Hellman key agreement only requires
exponentiations (i.e., computing powers of elements in the group generated by g), while the
ElGamal encryption and signature schemes require multiplications in the finite
field (i.e., M · P_A^r, c^{−a} · d, and P_A^{H(c)} · c^d).

2.5 Using XTR to Illustrate the Idea

We give an illustration, in the case n = 6, of the idea behind [8,13,19] and the 
Lucas-based cryptosystems. 

XTR is short for ECSTR, which stands for Efficient Compact Subgroup Trace 
Representation. 

The trace is the trace map from F_{p^6} to F_{p^2}, which is defined by

Tr(h) = h + h^{p^2} + h^{p^4} = h + σ(h) + σ^2(h),

where σ generates the Galois group Gal(F_{p^6}/F_{p^2}). (Note that h^{p^6} = h.)

The subgroup is the subgroup of F_{p^6}^× of order p^2 − p + 1 = Φ_6(p). Choose a
generator g of this subgroup.

— Alice sends Tr(g^a) to Bob.

— Bob sends Tr(g^b) to Alice.

— They share Tr(g^{ab}).

Since the transmissions are elements of F_{p^2}, Alice and Bob are sending 2
(= φ(6)) elements of F_p, rather than 6 elements of F_p (i.e., one element of F_{p^6},
as would be the case in classical Diffie-Hellman over the field F_{p^6}). The point
is that the trace gives an efficient compact representation of elements in the
subgroup ⟨g⟩.

We claim that Alice and Bob now share Tr(g^{ab}) ∈ F_{p^2}. This is proved in [19],
where an efficient way to compute Tr(g^{ab}) is given. Let’s convince ourselves that
Alice and Bob really do have enough information to compute Tr(g^{ab}). Suppose
that h is an element of the subgroup of F_{p^6}^× of order p^2 − p + 1. Let

C_h = {h, σ(h), σ^2(h)}.

The three elementary symmetric polynomials of the set C_h are:

Π_1(C_h) = h + σ(h) + σ^2(h) = Tr(h),

Π_2(C_h) = h · σ(h) + h · σ^2(h) + σ(h) · σ^2(h) = Tr(h · σ(h)),

Π_3(C_h) = h · σ(h) · σ^2(h) = N(h),

where N : F_{p^6}^× → F_{p^2}^× is the norm map. It turns out that if h is in the subgroup
of order p^2 − p + 1, then Π_2(C_h) = Tr(h)^p and Π_3(C_h) = 1.
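
For readers who want to see why these two identities hold, here is one short derivation. It assumes that σ is the Frobenius generator h ↦ h^{p^2} (matching the formula for the trace above) and uses only the relation h^{p^2 − p + 1} = 1 together with the factorization p^4 + p^2 + 1 = (p^2 + p + 1)(p^2 − p + 1).

```latex
% Assumptions: \sigma(h) = h^{p^2}, and h^{p^2-p+1} = 1.
\begin{align*}
h^{p^2} &= h^{p-1}, \quad h^{p^3} = h^{-1}, \quad h^{p^4} = h^{-p}, \quad h^{p^5} = h^{1-p},\\
\Pi_3(C_h) &= h^{1+p^2+p^4} = \bigl(h^{p^2-p+1}\bigr)^{p^2+p+1} = 1,\\
\Pi_2(C_h) &= h^{1+p^2} + h^{1+p^4} + h^{p^2+p^4} = h^{p} + h^{1-p} + h^{-1},\\
\operatorname{Tr}(h)^p &= h^{p} + h^{p^3} + h^{p^5} = h^{p} + h^{-1} + h^{1-p} = \Pi_2(C_h).
\end{align*}
```

The last line uses that raising to the p-th power is additive in characteristic p.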

Thus, knowing Tr(h) is equivalent to knowing the values of all the
elementary symmetric polynomials of C_h, which is equivalent to knowing the set C_h.
However, if you know C_h and you know a, then you know C_{h^a}, just by taking
every element of C_h to the power a. But we have already noted that knowing
C_{h^a} is equivalent to knowing Tr(h^a).

To sum up, if h is in the subgroup of F_{p^6}^× of order p^2 − p + 1, then a and
Tr(h) together determine Tr(h^a). Since Alice knows Tr(g^b) and a, she has enough
information to compute Tr((g^b)^a), and similarly Bob can compute Tr((g^a)^b).

Note that knowing C_h is equivalent to knowing the characteristic polynomial
of h over F_{p^2}, since that characteristic polynomial is

∏_{c ∈ C_h} (x − c) = x^3 − Π_1(C_h) x^2 + Π_2(C_h) x − Π_3(C_h).



Remark 2. In XTR [19], the Gong-Harn system [13], and the Lucas-based
cryptosystems, Alice can compute f(g^{ab}) from f(g^b) and a, for a suitable function
f (usually a trace). In other words, these cryptosystems can exponentiate, as
is needed for doing (analogues of) Diffie-Hellman. However, they cannot
multiply in a straightforward way. If you know Tr(g) and Tr(h), that does not give
you enough information to compute Tr(gh), since C_g and C_h do not determine
the set C_{gh} (knowing only C_g and C_h, you do not have enough information to
distinguish C_{gh} from C_{g·σ(h)}, for example). These are examples of “lossy”
compression. If one orders the conjugates of h and transmits a couple of extra bits
to specify which conjugate h is, then one can reconstruct h from Tr(h), and
perform multiplications in F_{p^6}.

3 Torus-Based Cryptography 

The goal is to find a computable function f satisfying the following properties:

— the number of bits needed to represent f(h) is less than the number of bits
needed to represent h (ideally, f(h) is φ(n)/n as long as h),

— f(h) and a determine f(h^a) and h^a,

— f(g) and f(h) determine f(gh) and gh,

— f is defined on almost all elements of the subgroup of F_{q^n}^× of order Φ_n(q).

Note that these conditions imply that f has a computable inverse function.
From now on, fix a square-free integer n and a prime power q. (Square-free
means that the only square that divides n is 1.)

Definition 3. Let T_n denote the subgroup of F_{q^n}^× of order Φ_n(q).



Example 4. (i) Diffie-Hellman is based on the group T_1 = F_q^×.

(ii) If q is not a power of 2, one can write F_{q^2} = F_q(√d). Then

T_2 = {a + b√d : a, b ∈ F_q and (a + b√d)^{q+1} = 1}

    = {a + b√d : a, b ∈ F_q and a^2 − d b^2 = 1} ⊂ F_{q^2}^×,

since (a + b√d)^q = a − b√d.

Choose a prime power q of about 1024/n bits, such that Φ_n(q) is divisible
by a large prime. Choose g ∈ T_n whose order ℓ is divisible by that large prime.
Suppose for now that one has efficiently computable maps

f : T_n ⇢ F_q^{φ(n)},    j : F_q^{φ(n)} ⇢ T_n        (1)

that are inverses of each other. The dotted arrows signify that these maps need
not be defined everywhere; they might be undefined at a “small” number of
elements. In §3.4, §3.6, §6.3, and [25] we discuss the maps f and j, and give
explicit examples. The following protocols are generalized Diffie-Hellman and
ElGamal [21], using the subgroup T_n of F_{q^n}^×. In §3.7 below we discuss how to
represent the message in ⟨g⟩. Note that the maps f and j allow one to compress
transmissions not only for Diffie-Hellman and ElGamal, but also for any discrete
log-based system that can use a general group.




3.1 Torus-Based Diffie-Hellman Key Agreement 

Alice chooses an integer a randomly in the interval [1, ℓ − 1]. Similarly, Bob
chooses a random integer b from the same range.

— Alice sends P_A = f(g^a) ∈ F_q^{φ(n)} to Bob.

— Bob sends P_B = f(g^b) ∈ F_q^{φ(n)} to Alice.

— They share (j(P_B))^a = g^{ab} = (j(P_A))^b, and also f(g^{ab}).

3.2 Torus-Based ElGamal Encryption

Alice’s private key: an integer a, random in the interval [1, ℓ − 1]

Alice’s public key: P_A = f(g^a) ∈ F_q^{φ(n)}

— Bob represents the message M in ⟨g⟩ and picks a random r between 1 and
ℓ − 1. The ciphertext is (c, d) where c = f(g^r) and d = f(M · j(P_A)^r).

— To decrypt a ciphertext (c, d), Alice computes M = j(d) · j(c)^{−a}.



3.3 Torus-Based ElGamal Signatures 

Fix a cryptographic hash function H : {0,1}* → Z/ℓZ.

Alice’s private key: an integer a, random in the interval [1, ℓ − 1]

Alice’s public key: P_A = f(g^a) ∈ F_q^{φ(n)}

— To sign a message M ∈ {0,1}*, Alice chooses a random integer r between
1 and ℓ − 1 with gcd(r, ℓ) = 1. Alice’s signature on M is (c, d) where c =
f(g^r) ∈ F_q^{φ(n)} and d = r^{−1}(H(M) − a H(c)) (mod ℓ).

— Bob accepts Alice’s signature if and only if

g^{H(M)} = j(P_A)^{H(c)} · j(c)^d.

The signature length is φ(n) log_2(q) + log_2(ℓ) bits, as opposed to n log_2(q) +
log_2(ℓ) bits in the classical ElGamal signature scheme over F_{q^n}.



3.4 The T_2-Cryptosystem

Here, n = 2. Choose a prime power q that has about 512 bits, and such that
(q + 1)/2 is a prime. One can write F_{q^2} = F_q(√d) for some non-square d ∈ F_q^×. Define

j : F_q → T_2

by

j(a) = (a + √d)/(a − √d).

Define an inverse map (defined on T_2 − {1, −1}):

f : T_2 → F_q by f(a + b√d) = (1 + a)/b.

It is easy to check that if a, b ∈ F_q and a ≠ −b, then

j(a) j(b) = j( (ab + d)/(a + b) ).

In the T_2-cryptosystem, one does Diffie-Hellman key agreement and ElGamal
encryption and signatures, using the group law on the group T_2, while representing
the elements in F_q. Here, it is not necessary to go back and forth between F_q
and T_2, since the previous equation translates T_2’s multiplication to F_q, i.e.,
multiplication in T_2 translates into the following operation on F_q:

(a, b) ↦ (ab + d)/(a + b),

giving a way to compose elements of F_q without having to pass to T_2 each time.
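
The maps j and f and the induced composition law can be checked exhaustively over a small field. In the sketch below the values q = 103 and d = 5 are assumptions chosen only so that the loops finish instantly; elements x + y√d of F_{q^2} are stored as pairs (x, y), and the helper names are ours. `pow(x, -1, q)` needs Python 3.8 or later.

```python
q, d = 103, 5   # toy values (assumed): q an odd prime, d a non-square mod q
assert pow(d, (q - 1) // 2, q) == q - 1   # Euler's criterion: d is a non-square

def mul(s, t):
    """Multiply x1 + y1*sqrt(d) and x2 + y2*sqrt(d) in F_{q^2}."""
    (x1, y1), (x2, y2) = s, t
    return ((x1 * x2 + d * y1 * y2) % q, (x1 * y2 + y1 * x2) % q)

def inv(s):
    """Invert x + y*sqrt(d) by dividing its conjugate by the norm x^2 - d*y^2."""
    x, y = s
    n_inv = pow((x * x - d * y * y) % q, -1, q)
    return (x * n_inv % q, -y * n_inv % q)

def j(a):
    """j(a) = (a + sqrt(d)) / (a - sqrt(d)), an element of T_2."""
    return mul((a % q, 1), inv((a % q, q - 1)))

def f(t):
    """f(a + b*sqrt(d)) = (1 + a)/b; undefined at t = 1 and t = -1, where b = 0."""
    a, b = t
    return (1 + a) * pow(b, -1, q) % q

def compose(a, b):
    """Operation on F_q induced by multiplication in T_2 (requires a != -b)."""
    return (a * b + d) * pow(a + b, -1, q) % q

for a in range(q):
    x, y = j(a)
    assert (x * x - d * y * y) % q == 1      # norm 1, so j(a) lies in T_2
    if a != 0:                               # j(0) = -1, where f is undefined
        assert f(j(a)) == a
    for b in range(q):
        if (a + b) % q != 0:
            assert mul(j(a), j(b)) == j(compose(a, b))
```

The excluded case a = −b corresponds to j(a) j(b) = 1, the identity of T_2, which has no image under f; an implementation working purely in F_q would need a separate symbol for it.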

3.5 The CEILIDH Public Key System

The acronym CEILIDH (pronounced “cayley”) stands for Compact, Efficient, 
Improves on LUC, Improves on Diffie-Hellman. The CEILIDH key agreement 
(resp., encryption, resp., signature) scheme is torus-based Diffie-Hellman (resp., 
ElGamal encryption, resp., ElGamal signatures) in the case n = 6. 

Examples 11 and 12 of [25] give explicit examples of maps f and j (called ρ
and ψ there) when n = 6. We give a new example in §3.6 (and use it in §3.7).



3.6 An Explicit Example of Maps f and j

Take an odd prime power q congruent to 2, 6, 7, or 11 (mod 13) and such that
Φ_6(q) is prime. Then F_q(ζ_13) = F_{q^12}, where ζ_13 is a primitive 13-th root of unity,
and F_q(z) = F_{q^6}, where z = ζ_13 + ζ_13^{−1}. Let

y = ζ_13 + ζ_13^{−1} + ζ_13^5 + ζ_13^{−5}.

For u, v ∈ F_q, define

j(u, v) = (r − s√13)/(r + s√13) ∈ T_6,

where

r = (3(u^2 + v^2) + 7uv + 34u + 18v + 40) y^2 + 26u y
    − (21u(3 + v) + 9(u^2 + v^2) + 28v + 42),

s = 3(u^2 + v^2) + 7uv + 21u + 18v + 14.

For t ∈ T_6, define

f(t) = ( u/(w + 1), (v − 3)/(w + 1) ) ∈ F_q^2,

with

t = a + b√13 and (1 + a)/b = w y^2 + u(y + y^2/2) + v,

where t is written with respect to the basis {1, √13} for F_{q^6}/F_{q^3}, with a, b ∈
F_{q^3} = F_q(y), and (1 + a)/b is written with respect to the basis {y^2, y + y^2/2, 1} for
F_{q^3}/F_q, with u, v, w ∈ F_q.

Then f and j are inverses. The map j : F_q^2 → T_6 is defined on all of F_q^2. The
map f : T_6 → F_q^2 is defined except at 1 and −2z^5 + 6z^3 − 4z − 1 ∈ T_6.
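
One part of the claim that f and j are inverses can be checked formally, without any arithmetic in F_{q^6}. The sketch below is a consistency check under an assumed reading of the formulas: r = (3(u^2 + v^2) + 7uv + 34u + 18v + 40) y^2 + 26u y − (21u(3 + v) + 9(u^2 + v^2) + 28v + 42), s = 3(u^2 + v^2) + 7uv + 21u + 18v + 14, j(u, v) = (r − s√13)/(r + s√13), f(t) = (u/(w + 1), (v − 3)/(w + 1)), and the basis {y^2, y + y^2/2, 1}. For t = (r − s√13)/(r + s√13) = a + b√13 one finds (1 + a)/b = −r/s, and since s is a scalar, the coordinates of −r/s can be read off directly. The computation is done over the rationals as a stand-in for F_q; the function names are ours.

```python
from fractions import Fraction

def r_coeffs(u, v):
    """Coefficients of r with respect to (y^2, y, 1), under the assumed formula."""
    A = 3 * (u * u + v * v) + 7 * u * v + 34 * u + 18 * v + 40
    B = 21 * u * (3 + v) + 9 * (u * u + v * v) + 28 * v + 42
    return A, 26 * u, -B

def s_value(u, v):
    return 3 * (u * u + v * v) + 7 * u * v + 21 * u + 18 * v + 14

def f_of_j(u, v):
    """Apply f to j(u, v) symbolically, using (1 + a)/b = -r/s."""
    s = s_value(u, v)
    c2, c1, c0 = (Fraction(-x, s) for x in r_coeffs(u, v))
    # Rewrite c2*y^2 + c1*y + c0 as w*y^2 + u'*(y + y^2/2) + v'.
    u_coord = c1
    w = c2 - c1 / 2
    v_coord = c0
    return u_coord / (w + 1), (v_coord - 3) / (w + 1)

for u in range(-5, 6):
    for v in range(-5, 6):
        if s_value(u, v) != 0:
            assert f_of_j(u, v) == (u, v)
```

In this reading w + 1 = −26/s, so the check needs 2 and 13 to be invertible, which holds for the q considered here. It does not verify that j(u, v) actually lies in T_6.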



3.7 Representing Elements of F_q^{φ(n)} in ⟨g⟩

For torus-based ElGamal encryption, how does one represent a message as an
element of ⟨g⟩? First, represent the message as an element M in F_q^{φ(n)}.

If g is taken to be a generator of T_n, then taking j(M) represents the message
in ⟨g⟩ (where j is as in (1)). Note that any g ≠ 1 in T_n is a generator of T_n
whenever Φ_n(q) is prime.

If g is taken to be in an index s subgroup of T_n for some small integer s, then
by adding a few bits of redundancy to M, after at most a few tries one obtains
an M such that j(M) is in ⟨g⟩. If g has order ℓ, one can test whether j(M) is in
⟨g⟩ by checking whether j(M)^ℓ = 1.

How does one represent the message in (g) when n = 6? 

Take a prime r and an odd prime power q such that the order of q (mod r)
is divisible by 6 but is not 6 itself, and such that Φ_6(q) is prime. (One expects,
but cannot prove, that there are infinitely many such q; it is not hard to find
some in a suitable range for cryptography, e.g., such that q has about 170 bits, to
get 1024-bit security.) These conditions ensure that F_q(ζ_r) contains F_{q^6}, where
ζ_r is a primitive r-th root of unity. (Note that if the order of q (mod r) is 6,
then Φ_6(q) is divisible by r, so is not prime. Note also that the condition that
the order of q (mod r) is divisible by 6 implies that r ≡ 1 (mod 6).) In the case
r = 13, one can use the example given in §3.6. Here, one represents the message
in F_q^2, and uses the map j to put it in the prime order group T_6 = ⟨g⟩.

In Example 11 of [25], we have q ≡ 2 or 5 (mod 9). Here, Φ_6(q) is divisible
by 3. One can choose the prime power q so that Φ_6(q)/3 is prime. If one takes g
to have order Φ_6(q), then j(M) is in ⟨g⟩ = T_6.

Similarly for Example 12 of [25], we have q ≡ 3 or 5 (mod 7). Now Φ_6(q) is
divisible by 7. One can choose q so that Φ_6(q)/7 is prime. If g is taken to have
order Φ_6(q), then j(M) ∈ ⟨g⟩ = T_6.

The following sample parameters are all the primes q between 2^170 − 10^5 and
2^170 + 10^5 such that q^2 − q + 1 is prime and q has order 12 modulo 13:

1496577676626844588240573268701473812127674923933621, 

1496577676626844588240573268701473812127674923946773, 

1496577676626844588240573268701473812127674923949251, 

1496577676626844588240573268701473812127674924018047, 

1496577676626844588240573268701473812127674924027533. 



3.8 Comparison between CEILIDH and XTR 

The security of CEILIDH is exactly the same as that of XTR, with the same
security proof; they both rely on the security of the “hardest” subgroup of F_{q^6}^×
(see §3.11). Parameter selection for CEILIDH is exactly the same as for XTR.

The advantage of the T_2-cryptosystem and CEILIDH over LUC and XTR is
that T_2 and CEILIDH make full use of the multiplication in the group T_n (for
n = 2 and 6). This is especially useful for signature schemes. XTR is efficient for
key agreement and hybrid encryption (i.e., using a Diffie-Hellman-like protocol
to exchange a secret key, and using symmetric key encryption, not public key
encryption). CEILIDH can do efficient key agreement, public key (i.e.,
non-hybrid) encryption, and signatures.

XTR has computational efficiency advantages over CEILIDH (key agreement
can be performed with fewer operations).

3.9 Conjectural T_n-Cryptosystems

Whenever f and j exist as in (1), one has a “T_n-cryptosystem”, or T_n
compression technique. As in §3.1-§3.3, use f to compactly represent transmissions
in F_q^{φ(n)}, and use j to send elements of F_q^{φ(n)} to the group T_n, where group
operations can be performed.



3.10 Parameter Selection when n = 30

For torus-based ElGamal signatures, finding good parameters when n = 30
amounts to finding prime powers q of about 1024/30 ≈ 35 bits such that Φ_30(q)
has a prime factor ℓ of about 160 bits. Here is a method for doing this:

— choose a 20-30 bit prime p ≡ 1 (mod 30),

— find the x_1, . . . , x_8 with 1 < x_i < p whose orders modulo p are 30,

— find 35-bit primes q congruent to some x_i (mod p),

— factor out small (< 90-100 bits) prime divisors from the integer Φ_30(q)/p,

— see if what is left is a prime of about 160 bits.

Paul Leyland suggested doing the factorization step by using the Elliptic 
Curve Method optimized for 90 - 100 bit factors. Using this, he can obtain a 
few examples per hour on a laptop. 
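
The first steps of this search can be sketched as follows. This is a scaled-down illustration, not the procedure actually used: the prime p = 31 stands in for the 20-30 bit prime, plain trial division replaces the Elliptic Curve Method, and the final primality test is omitted. The function names are ours.

```python
def phi30(x: int) -> int:
    """Value of the 30th cyclotomic polynomial at x."""
    return x**8 + x**7 - x**5 - x**4 - x**3 + x + 1

def order_mod(x: int, p: int) -> int:
    """Multiplicative order of x modulo the prime p (x not divisible by p)."""
    k, y = 1, x % p
    while y != 1:
        y = y * x % p
        k += 1
    return k

def strip_small_factors(n: int, bound: int) -> int:
    """Remove all prime factors below `bound` by trial division (stand-in for ECM)."""
    for t in range(2, bound):
        while n % t == 0:
            n //= t
    return n

p = 31   # toy stand-in (assumed) for a 20-30 bit prime with p = 1 (mod 30)
xs = [x for x in range(2, p) if order_mod(x, p) == 30]
assert len(xs) == 8   # there are phi(30) = 8 residues of order 30 modulo p

# Any q congruent to some x_i (mod p) has order 30 mod p, hence p divides Phi_30(q).
for x in xs:
    for k in range(5):
        assert phi30(x + k * p) % p == 0

cofactor = strip_small_factors(phi30(3) // p, 100)   # 3 has order 30 modulo 31
print(phi30(3), cofactor)
```

Choosing q in one of these residue classes forces the divisor p into Φ_30(q), which helps bring the remaining cofactor closer to the 160-bit target after the small factors are removed.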

Note that the parameters are like Diffie-Hellman parameters — they do not 
need to be changed often, and the same q and g can be used for all users. 

The table below gives some pairs of primes q and ℓ where q has 35 bits, ℓ has
160 or 161 bits, and ℓ divides Φ_30(q). One expects there to be about

717267168 (ln(161) − ln(160)) ≈ 4.47 × 10^6

35-bit primes q such that Φ_30(q) has a 160-bit prime divisor (717267168 is the
number of 35-bit primes).
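
The arithmetic behind this estimate is easy to reproduce. Our reading of the heuristic, which is an assumption and is not spelled out in the text, is that the chance that a number has a prime divisor between 2^160 and 2^161 is roughly the sum of 1/ℓ over primes ℓ in that range, which is about ln ln(2^161) − ln ln(2^160) = ln(161) − ln(160).

```python
import math

count_35_bit_primes = 717267168   # the count of 35-bit primes quoted in the text

# Heuristic chance of a prime divisor with exactly 161 bits, as read above.
probability = math.log(161) - math.log(160)
estimate = count_35_bit_primes * probability

print(f"{probability:.6f}")   # about 0.006231
print(f"{estimate:.3e}")      # about 4.469e+06
```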



          q    ℓ

18849585563    2721829278598645763229135555203875381215025850251
18859507111    1145377552213689334808880803247608425700596690441
18918018433    2191067457957167273280468413326196522745324110911
18937704077    2622917550423816956639040650402145314798081975731
19020912667    2009907944188511109843286107856362388569736938661
19096959863    2670351518767065322212846696686298421468094820481
19123281371    1089731979081189465083403285791765213322453796291
19200181867    1382108007746224782292716444254570494753142184301
19241156549    1292631930593942028414888386684571922308680383411

3.11 Security

The security of all the systems discussed thus far is the discrete log security of
the “hardest” subgroup of F_{q^n}^×, in the following sense. The group F_{q^n}^× is “almost
the same” as the direct product

∏_{d|n} T_d = T_n × ∏_{d|n, d≠n} T_d

(there are homomorphisms between them for which the prime divisors of the
orders of the kernel and cokernel all divide n); see pp. 60-61 of [30].

We have T_d ⊆ F_{q^d}^× for all d, so for d < n the elements of these subgroups
lie in a strictly smaller field than F_{q^n}. Therefore, these groups T_d are weaker
for cryptographic purposes: they are vulnerable to attacks on the discrete
logarithm problem in F_{q^d}^×, where now d < n.

Almost none of the elements of T_n lie in a smaller field than F_{q^n} (see Lemma 1
of [6]). Therefore, T_n can be viewed as the cryptographically strongest subgroup
of F_{q^n}^×.
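
The orders of the groups T_d multiply to the order of the full multiplicative group, since q^n − 1 is the product of Φ_d(q) over the divisors d of n. The sketch below computes these orders from that identity and compares them with the explicit cyclotomic polynomials quoted earlier; the function name is ours, and q = 7 is just a small sample value.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def cyclotomic_value(n: int, q: int) -> int:
    """Phi_n(q), obtained by dividing q^n - 1 by Phi_d(q) for proper divisors d of n."""
    value = q**n - 1
    for d in range(1, n):
        if n % d == 0:
            value //= cyclotomic_value(d, q)
    return value

q = 7
# Agreement with the closed forms for n = 1, 2, 3, 6 and 30.
assert cyclotomic_value(1, q) == q - 1
assert cyclotomic_value(2, q) == q + 1
assert cyclotomic_value(3, q) == q * q + q + 1
assert cyclotomic_value(6, q) == q * q - q + 1
assert cyclotomic_value(30, q) == q**8 + q**7 - q**5 - q**4 - q**3 + q + 1

# Orders of the subgroups T_d of F_{q^6}^*, for d dividing 6.
for d in (1, 2, 3, 6):
    print(d, cyclotomic_value(d, q))
```

For n = 6 the group T_6 has order about q^2 yet sits in no proper subfield, whereas T_1, T_2 and T_3 lie in F_q, F_{q^2} and F_{q^3} respectively.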

4 Improving Pairing-Based Cryptography 

Inspired by and building on a paper of Galbraith [12], in [24] we use the the- 
ory of supersingular abelian varieties to improve the efficiency of pairing-based 
cryptosystems. 

Pairing-based cryptography was conceived of independently by Joux [14] 
and by Sakai, Ohgishi, and Kasahara [27]. There are numerous applications of 
pairing-based cryptography, including tripartite Diffie-Hellman, identity-based 
encryption, and short signatures. See [1] for numerous references and informa- 
tion. 

The Boneh-Lynn-Shacham (BLS) short signature scheme [5] uses pairings 
associated with elliptic curves. The question of whether one can use abelian va- 
rieties (which are higher dimensional generalizations of elliptic curves) to obtain 
shorter signatures was stated as an open problem in [5], and answered in the 
affirmative in [24]. While we arrived at our method (see §4.2 below) for com- 
pressing BLS signatures by studying the arithmetic of abelian varieties, in fact 
our final algorithm can be performed entirely using elliptic curve arithmetic, 
without going to higher dimensional abelian varieties. 

The Rubin-Silverberg (RS) modification of the BLS signature scheme
multiplies the security of BLS signatures by n while multiplying the signature size
by φ(n). Implementations when n = 3 and n = 5 are given in [24]. We give an
example when n = 5 in §4.2 below.

Our methods can be used to improve the bandwidth efficiency of any pairing- 
based cryptosystem, not just the BLS signature scheme. 

4.1 BLS Short Signature Scheme

We give an example of the Boneh-Lynn-Shacham signature scheme, with fixed
parameters.

Let q = 3^97. Consider the elliptic curve E^+ : y^2 = x^3 − x + 1 over F_q, and
take P ∈ E^+(F_q) of (prime) order

ℓ = 2726865189058261010774960798134976187171462721.

Note that #E^+(F_q) = 7ℓ.

Use a pairing

e : ⟨P⟩ × ⟨P⟩ → F_{q^6}^×

that satisfies

e(aP, bP) = e(P, P)^{ab} for every a, b ∈ Z,
e(P, P) ≠ 1.

One can use a modified Weil or Tate pairing [15].

The public information is q, E^+, P, ℓ, e, and a cryptographic hash function

H : {0,1}* → ⟨P⟩.

Alice’s private key: an integer a, random in the interval [1, ℓ − 1]

Alice’s public key: P_A = aP

— To sign a message M ∈ {0,1}*, Alice computes P_M = H(M) and aP_M =
(s, t) ∈ ⟨P⟩.

— Alice’s signature is s ∈ F_q (and 1 bit to recover the sign of t).

— To verify the signature, Bob computes

t = ±√(s^3 − s + 1) ∈ F_q,

lets

P' = (s, t) (= aP_M),

and checks that

e(P, P') = e(P_A, P_M).
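
The verification equation follows directly from the bilinearity property stated above. In the derivation below, m is a hypothetical integer with P_M = mP; such an m exists because H takes values in ⟨P⟩, although nobody needs to know it.

```latex
% Assumption: P_M = mP for some integer m, and P' = aP_M is the genuine signature point.
\begin{align*}
e(P, P') &= e(P, aP_M) = e(P, amP) = e(P, P)^{am},\\
e(P_A, P_M) &= e(aP, mP) = e(P, P)^{am}.
\end{align*}
```

Both sides therefore agree for a valid signature, and since e(P, P) ≠ 1 has prime order ℓ, they disagree whenever P' is a different multiple of P.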



4.2 RS Compression of BLS Signatures

We give an example with fixed parameters, with n = 5. Let q' = 3^19 and let
q = (q')^5 = 3^95. Consider the elliptic curve E^− : y^2 = x^3 − x − 1, and take
P ∈ E^−(F_q) of (prime) order

ℓ = 6733238586040336762338876960599521.

Note that

#E^−(F_q) = 271 · 1162320517 · ℓ,
#E^−(F_{3^5}) = 271, #E^−(F_{q'}) = 1162320517.
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Take a pairing e and a hash function H as before. Let cr be a generator of 
Gal(Fq/F,/). For Q G if“(Fg), 

Trp^/F_^, (Q) = Q + <j{Q) + cr^(Q) + cr^iQ) + cr'^iQ)- 

Let 

^0 = {Q G -S' : Tr]f^/F^, (Q) = Oe-}, 

the “trace -0 subgroup” of E~{¥g). Then Aq has order 271 • £. Since P has order 
£, we have P G Aq. 
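The group orders quoted in §4.1 and §4.2 can be checked with integer arithmetic alone. The sketch below assumes $q = 3^{97}$ in §4.1 and $q' = 3^{19}$, $q = 3^{95}$ in §4.2, and uses the standard shape $3^m + 1 \pm 3^{(m+1)/2}$ for the number of points on these supersingular curves over $\mathbf{F}_{3^m}$ with $m$ odd. The signs are chosen here to match the orders stated in the text; the function names are illustrative only.

```python
# Sanity check of the point counts quoted in Sections 4.1 and 4.2.
# Assumptions: q = 3**97 (Section 4.1); q' = 3**19 and q = 3**95 (Section 4.2).

ELL_BLS = 2726865189058261010774960798134976187171462721
ELL_RS = 6733238586040336762338876960599521


def supersingular_order(m, sign):
    """Return 3^m + 1 + sign * 3^((m+1)/2) for odd m and sign in {+1, -1}."""
    return 3**m + 1 + sign * 3**((m + 1) // 2)


# E+ : y^2 = x^3 - x + 1 over F_{3^97}
order_plus_97 = supersingular_order(97, +1)

# E- : y^2 = x^3 - x - 1 over F_{3^5}, F_{3^19}, and F_{3^95}
order_minus_5 = supersingular_order(5, +1)
order_minus_19 = supersingular_order(19, +1)
order_minus_95 = supersingular_order(95, -1)

# The trace-0 subgroup A_0 has index #E-(F_{q'}) in E-(F_q).
trace_zero_order = order_minus_95 // order_minus_19
```

In particular the order of the trace-0 subgroup, $271 \cdot \ell$, is the quotient $\#E^-(\mathbf{F}_q) / \#E^-(\mathbf{F}_{q'})$.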

Alice’s private key: an integer $a$, random in the interval $[1, \ell]$

Alice’s public key: $P_A = aP$

— To sign $M$, as before, Alice computes $P_M = H(M)$ and $aP_M = (s, t)$.

— Letting $(s_0, s_1, s_2, s_3, s_4)$ be the coordinates of $s$ with respect to a basis for
$\mathbf{F}_q$ over $\mathbf{F}_{q'}$, Alice’s signature is $(s_1, s_2, s_3, s_4)$ (and 6 bits to recover $s_0$ and
$t$).

— To verify the signature, Bob first uses that $\mathrm{Tr}_{\mathbf{F}_q/\mathbf{F}_{q'}}(P) = O_{E^-}$ to reconstruct
$s_0$ (see below).

— Bob then, as before, computes

$t = \pm\sqrt{s^3 - s - 1} \in \mathbf{F}_q$,

lets

$P' = (s, t)$ $(= aP_M)$,

and checks that

$e(P, P') = e(P_A, P_M)$.

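The process of reconstructing $s_0$ and $t$ from $s_1, s_2, s_3, s_4$ is as follows. The
input is $(s_1, s_2, s_3, s_4) \in \mathbf{F}_{q'}^4$, and the output will be $s_0 \in \mathbf{F}_{q'}$ and $t \in \mathbf{F}_q$.
Viewing $\mathbf{F}_q$ as $\mathbf{F}_{q'}(z)$ with $z^5 - z + 1 = 0$, and letting
$c = S + \sum_{i=1}^{4} s_i z^i$, define $a_0, \ldots, a_4 \in \mathbf{F}_{q'}[S]$ by

$\prod_{i=0}^{4} (Y - \sigma^i(c)) = Y^5 + a_4 Y^4 + a_3 Y^3 + a_2 Y^2 + a_1 Y + a_0$.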

The trace -0 condition can (eventually) be reduced to finding simultaneous solu- 
tions of Pi = 0 and p2 = 0, where pi and p2 are as follows: 



Pi = A®-a 4 A’^-|-(l-l-a 4 - 03 )A®-|-(a 4 -a 4 -a 2 )A®-|-(a 4 -a 4 -|-a|-a 3 -a 4 a 2 )A‘* 

-l- (1 — 04-1-04 — O4 — 03 -|- O4O3 -l- 02 — 03O2 -l- Oo)V^ 

+ ( — 1 + (3.4 — 04 + G-4 + 04 + + 0.403 — O3 — O3 — O2 — O4O2 H“ O4O3O2 H“ 02)^ 

-I- (—1 — 04 — 04 — 04 — 04 — 04-1-03-1- 0403 — O4O3 — O4O3 — O3 
— O4O3 -I- O40I — 02 — O4O2 — O4O2 -b O3O2 — O4O3O2 — 0302)^ 

-b 1 — O4 — O4 -b O4 -b O3 — O4O3 -b O3 — O4O3 -b O3, 







P 2 — + ( — 1 — 04 — O 4 + 0 - 2 )^^ ~t~ ( — 1 + O 4 — CI 3 — CI 4 CL 2 + 

+ ( — 1 — 04+04 + 04 — CI 3 — O 4 O 3 — 02 + O 4 O 2 — 0302 )^ — 1 + O 4 — O 3 . 

Taking the resultant of $p_1$ and $p_2$ eliminates the variable $X$, and gives a degree
27 polynomial $h \in \mathbf{F}_{q'}[S]$ that has $s_0$ as a root. The extra 6 bits allow one to
decide which root of $h$ to take for $s_0$, and to determine $t$. The polynomial $h(S)$
is of the form $h_1(S^3 - S)$ for a certain degree 9 polynomial $h_1(S) \in \mathbf{F}_{q'}[S]$, and
this simplifies finding the roots of $h$. See §5.1 of [24] for an explanation of this
reconstruction step.

RS compression was arrived at by studying the Weil restriction of scalars 
of elliptic curves (which are abelian varieties), and understanding the theory of 
abelian varieties. In §5.7 we discuss some of the underlying mathematics. 

Remark 5. In elliptic curve point compression and in BLS, an elliptic curve
point $(x, y)$ is compressed to its $x$-coordinate, giving lossy compression. One
can transmit an extra bit that determines the $y$-coordinate, in order to fully
reconstruct the point. The signature $(s_1, s_2, s_3, s_4)$ above is similarly an example
of lossy compression; the extra 6 bits and the reconstruction step allow one to
fully recover the elliptic curve point $(s, t)$.

4.3 Comparison 

RS compression (§4.2) produces signatures that are roughly 4/5 as large as BLS
signatures with comparable security. In both cases, the security is based on the
difficulty of the Elliptic Curve Diffie-Hellman Problem in $\langle P \rangle$. RS signing is
no more work than for BLS. Compared with BLS, RS verification requires an
additional reconstruction step to recover $s_0$. For applications in which the verifier
is powerful, this is not a significant problem.

Note that RS compression (like BLS) only uses elliptic curve arithmetic, and 
does not use any abelian variety arithmetic. 

Bernstein and Bleichenbacher have compressed RSA and Rabin signatures
([2,3]). In Table 1 below, BCR stands for Bleichenbacher’s Compressed Rabin
signatures, DSA is the Digital Signature Algorithm, and ECDSA is the Ellip-
tic Curve Digital Signature Algorithm. In the middle column of Table 1, the
signatures are all scaled to 1024-bit RSA security. In the remaining columns
the signatures are scaled to the MOV security of the RS scheme. The MOV
security refers to attacks on the discrete log problem in $\mathbf{F}_{q^6}^{\times}$. The DL security
refers to generic attacks on the group $\langle P \rangle$; the relevant value for DL security is
$\log_2(\ell)$ bits, where $\ell$ is the order of $P$. (See [5,24].)

There is an RS scheme similar to the one in §4.2 (see §5.2 of [24]) that uses
elliptic curves over binary fields $\mathbf{F}_{2^m}$. Working over binary fields might yield some
efficiency advantages. However, due to Coppersmith’s attack on the discrete log
problem in low characteristic [9], larger parameters should be used.

To achieve the flexibility of higher characteristic, in §6 of [24] we suggest 
the use of (Jacobian varieties of) certain twists of Fermat curves. In a recent 
preprint giving an expanded version of [5], Boneh, Lynn, and Shacham suggest 
using MNT elliptic curves.

Table 1. Signature lengths, in bits, for comparable MOV security

system
RSA       904    1024    2045
BCR       452     512    1024
DSA               320
ECDSA             320
BLS       152     172     342
RS        127     143     279

5 The Underlying Mathematics 

5.1 Varieties and Algebraic Groups 

Definition 6. Loosely speaking, an algebraic variety (over a field $k$) is the so-
lution set of a system of polynomial equations (whose coefficients are in $k$). An
algebraic group (or group variety) over a field $k$ is a variety over $k$ such that the
group law and the inverse map are quotients of polynomials whose coefficients
are in $k$.

5.2 The Weil Restriction of Scalars 

Suppose that $V$ is a variety over a field $L$. This means that $V$ is the solution
set of a system of polynomial equations $f_i(x_1, \ldots, x_r) = 0$, $1 \le i \le s$, where the
polynomials $f_i$ have coefficients in the field $L$. Suppose $k$ is a subfield of $L$, and
$n$ is the degree of $L$ over $k$. Fix a basis $\{v_1, \ldots, v_n\}$ for $L$ over $k$. Write
$x_i = \sum_{j=1}^{n} y_{ij} v_j$ with variables $y_{ij}$. Substitute this into the equations
$f_i(x_1, \ldots, x_r) = 0$. Multiplying out, writing everything with respect to the basis
$\{v_1, \ldots, v_n\}$, and equating coefficients, one obtains a system of polynomials in
the variables $\{y_{ij}\}$, with coefficients in the field $k$. The variety defined by these
new equations is denoted $\mathrm{Res}_{L/k}V$, and is called the (Weil) restriction of scalars
from $L$ down to $k$. It is a variety over $k$ with the property that its $k$-points are
the $L$-points of $V$:

$(\mathrm{Res}_{L/k}V)(k) \cong V(L)$.

Its dimension is $n \cdot \dim(V)$. See for example §3.12 in Chapter 1 of [30] for more
information.

5.3 The Multiplicative Group Gm 

Diffie-Hellman is based on the multiplicative group, denoted $\mathbf{G}_m$. Over any field
$F$, the $F$-points on $\mathbf{G}_m$ are

$\mathbf{G}_m(F) = F^{\times} = F - \{0\}$,

the multiplicative group of invertible elements of the field $F$. The algebraic
variety $\mathbf{G}_m$ is defined by the equation $xy = 1$, i.e., it consists of the elements $x$
such that there exists a $y$ with $xy = 1$. It is an algebraic group over any field $k$.
We will view $\mathbf{G}_m$ as an algebraic group over the field $\mathbf{F}_q$.

5.4 The Restriction of Scalars $\mathrm{Res}_{\mathbf{F}_{q^n}/\mathbf{F}_q}\mathbf{G}_m$

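The Weil restriction of scalars $\mathrm{Res}_{\mathbf{F}_{q^n}/\mathbf{F}_q}\mathbf{G}_m$ is an algebraic variety (in fact, an
algebraic group) over $\mathbf{F}_q$. We have

$(\mathrm{Res}_{\mathbf{F}_{q^n}/\mathbf{F}_q}\mathbf{G}_m)(\mathbf{F}_q) \cong \mathbf{F}_{q^n}^{\times}$.

Example 7. To find equations defining the two-dimensional algebraic variety
$\mathrm{Res}_{\mathbf{F}_9/\mathbf{F}_3}\mathbf{G}_m$, write $\mathbf{F}_9 = \mathbf{F}_3(\sqrt{-1})$, and write $x = x_1 + x_2\sqrt{-1}$ and
$y = y_1 + y_2\sqrt{-1}$. Substituting into $xy = 1$ and equating coefficients gives the equations:

$x_1y_1 - x_2y_2 = 1$, $x_1y_2 + x_2y_1 = 0$.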
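As a small check of Example 7, one can enumerate the solutions of the two equations over $\mathbf{F}_3$ by brute force. The sketch below (the function name is illustrative only) lists every $(x_1, x_2, y_1, y_2) \in \mathbf{F}_3^4$ satisfying both equations; each solution corresponds to an invertible element $x_1 + x_2\sqrt{-1}$ of $\mathbf{F}_9$ together with its inverse.

```python
from itertools import product

P = 3  # characteristic; F_9 is modelled as F_3(sqrt(-1))


def res_gm_points():
    """Return all (x1, x2, y1, y2) in F_3^4 on the variety of Example 7."""
    points = []
    for x1, x2, y1, y2 in product(range(P), repeat=4):
        real_part = (x1 * y1 - x2 * y2) % P   # coefficient of 1 in x*y
        imag_part = (x1 * y2 + x2 * y1) % P   # coefficient of sqrt(-1) in x*y
        if real_part == 1 and imag_part == 0:
            points.append((x1, x2, y1, y2))
    return points


points = res_gm_points()
```

The number of solutions equals the number of nonzero elements of $\mathbf{F}_9$, in line with the identification of the $\mathbf{F}_3$-points with $\mathbf{F}_9^{\times}$.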

5.5 The Primitive Subgroup Go 

Suppose that $G$ is a commutative algebraic group over a field $k$. In the cases of
interest to us, $G$ will be the multiplicative group $\mathbf{G}_m$ or an elliptic curve. For
now, we write $G$’s group operation as multiplication.

If $L$ is a field that is a finite extension of $k$, define the primitive subgroup $G_0$
of $\mathrm{Res}_{L/k}G$ to be

$G_0 = \ker\Bigl[\mathrm{Res}_{L/k}G \xrightarrow{\ \oplus N_{L/F}\ } \bigoplus_{k \subseteq F \subsetneq L} \mathrm{Res}_{F/k}G\Bigr]$,

where the norm maps induce the usual norm maps

$N_{L/F} : G(L) \to G(F)$, $x \mapsto \prod_{\sigma \in \mathrm{Gal}(L/F)} \sigma(x)$.

Then $G_0$ is an algebraic group over $k$, and $G_0(k)$ consists of all elements of
$G(L)$ whose norm down to $G(F)$ is the identity, for every intermediate field $F$ with
$F \neq L$.

The group $\mathrm{Res}_{L/k}G$ is “almost the same” as the product $G \times G_0$ (there are
homomorphisms between them with “small” kernel and cokernel).

5.6 The Algebraic Torus T„ 

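Let $\mathbf{T}_n$ (or $\mathbf{T}_{n,q}$ when it is important to keep track of the ground field) denote
the primitive subgroup of $\mathrm{Res}_{\mathbf{F}_{q^n}/\mathbf{F}_q}\mathbf{G}_m$, i.e.,

$\mathbf{T}_n = \ker\Bigl[\mathrm{Res}_{\mathbf{F}_{q^n}/\mathbf{F}_q}\mathbf{G}_m \xrightarrow{\ \oplus N_{\mathbf{F}_{q^n}/\mathbf{F}_{q^d}}\ } \bigoplus_{d \mid n,\ d \neq n} \mathrm{Res}_{\mathbf{F}_{q^d}/\mathbf{F}_q}\mathbf{G}_m\Bigr]$.

By definition, $\mathbf{T}_n(\mathbf{F}_q)$ is the group of elements of $\mathbf{F}_{q^n}^{\times}$ that have norm 1 down
to every intermediate field $\mathbf{F}_{q^d}$ (for $d \mid n$, $d \neq n$). By Lemma 7 of [25],

$\mathbf{T}_n(\mathbf{F}_q) \cong T_n$.    (2)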







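Example 8. Continuing Example 7, where $q = 3$ and $n = 2$, it is easy to write
down embeddings:

$\mathbf{G}_m \hookrightarrow \mathrm{Res}_{\mathbf{F}_9/\mathbf{F}_3}\mathbf{G}_m$, $x \mapsto (x, 0, x^{-1}, 0)$,

$\mathbf{T}_2 \hookrightarrow \mathrm{Res}_{\mathbf{F}_9/\mathbf{F}_3}\mathbf{G}_m$, $x_1 + x_2\sqrt{-1} \mapsto (x_1, x_2, x_1, -x_2)$.

The compositions (in both orders) of the resulting map

$\mathbf{G}_m \times \mathbf{T}_2 \to \mathrm{Res}_{\mathbf{F}_9/\mathbf{F}_3}\mathbf{G}_m$

with the map

$\mathrm{Res}_{\mathbf{F}_9/\mathbf{F}_3}\mathbf{G}_m \to \mathbf{G}_m \times \mathbf{T}_2$

defined by

$(x_1, x_2, y_1, y_2) \mapsto (x_1^2 + x_2^2,\ x_1y_1 + x_2y_2 + 2x_2y_1\sqrt{-1})$

are the squaring maps. Thus, $\mathrm{Res}_{\mathbf{F}_9/\mathbf{F}_3}\mathbf{G}_m$ is “almost the same” as $\mathbf{G}_m \times \mathbf{T}_2$.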
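The claim in Example 8 can be verified exhaustively, since $\mathbf{F}_9^{\times}$ has only eight elements. The following sketch models $\mathbf{F}_9$ as pairs $(u, v)$ standing for $u + v\sqrt{-1}$ with entries mod 3. The helper names (`mul`, `inv`, `to_factors`, `from_factors`) are illustrative and not taken from the paper.

```python
P = 3  # F_9 = F_3(sqrt(-1)); the pair (u, v) stands for u + v*sqrt(-1)

elements = [(u, v) for u in range(P) for v in range(P)]
units = [z for z in elements if z != (0, 0)]


def mul(z, w):
    """Multiply two elements of F_9."""
    return ((z[0] * w[0] - z[1] * w[1]) % P, (z[0] * w[1] + z[1] * w[0]) % P)


def inv(z):
    """Invert a nonzero element of F_9 by exhaustive search."""
    return next(w for w in units if mul(z, w) == (1, 0))


def to_factors(z):
    """The map Res G_m -> G_m x T_2 of Example 8, applied to (x1, x2, y1, y2)."""
    x1, x2 = z
    y1, y2 = inv(z)
    norm = (x1 * x1 + x2 * x2) % P
    torus_part = ((x1 * y1 + x2 * y2) % P, (2 * x2 * y1) % P)
    return norm, torus_part


def from_factors(g, t):
    """The map G_m x T_2 -> Res G_m: multiply g in F_3^* by t in T_2."""
    return mul((g, 0), t)


# T_2(F_3): the elements of norm 1 in F_9.
torus = [z for z in units if (z[0] ** 2 + z[1] ** 2) % P == 1]
```

Both compositions then agree with squaring, and $\mathbf{T}_2(\mathbf{F}_3)$ has $q + 1 = 4$ elements.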



5.7 The Trace-0 Subgroup of $\mathrm{Res}_{\mathbf{F}_q/\mathbf{F}_{q'}}(E^-)$

Abelian varieties are, by definition, projective algebraic groups. Elliptic curves
are exactly the one-dimensional abelian varieties.

With $E^-$, $q'$, $q$, $\ell$, and $P$ as in §4.2, let

$B = \mathrm{Res}_{\mathbf{F}_q/\mathbf{F}_{q'}}(E^-)$,

and let $A$ be the primitive subgroup of $B$:

$A = \ker\bigl[B \xrightarrow{\ N_{\mathbf{F}_q/\mathbf{F}_{q'}}\ } E^-\bigr]$.

Then $A$ and $B$ are abelian varieties over $\mathbf{F}_{q'}$ of dimensions 4 and 5, respectively,
and $B$ is isogenous to $E^- \times A$. (See also §3.2 of [11].) The abelian variety $A$
is simple. Since the group law on an abelian variety is written additively, the
norm map now corresponds to the sum of the conjugates, i.e., the trace defined
in §4.2. We have

$$\begin{array}{ccccc}
\langle P \rangle & \subset & A_0 = \{Q \in E^-(\mathbf{F}_q) : \mathrm{Tr}_{\mathbf{F}_q/\mathbf{F}_{q'}}(Q) = O_{E^-}\} & \cong & A(\mathbf{F}_{q'}) \\
 & & \cap & & \cap \\
 & & E^-(\mathbf{F}_q) & \cong & B(\mathbf{F}_{q'})
\end{array}$$

Note that the underlying four-dimensional abelian variety $A$ is invisible in
the algorithms in §4.2.



6 Cryptographic Applications of Algebraic Tori and 
Their Quotients 

We give an exposition of some of the mathematics underlying torus-based cryp-
tography (i.e., the $\mathbf{T}_n$-cryptosystems) and the cryptosystems discussed in §2. We
discuss how the latter schemes are based on quotients of tori by the actions of
symmetric groups.

6.1 Algebraic Tori

Definition 9. An algebraic torus is an algebraic group that over some larger
field is a product of multiplicative groups. A field over which the torus becomes
isomorphic to a product of multiplicative groups is called a splitting field for the
torus; one says that the torus splits over that field. See [23,30] for expositions.

Example 10. (i) For every positive integer $r$, $\mathbf{G}_m^r$ is an $r$-dimensional algebraic
torus.

(ii) $\mathrm{Res}_{\mathbf{F}_{q^n}/\mathbf{F}_q}\mathbf{G}_m$ is an $n$-dimensional algebraic torus over $\mathbf{F}_q$ that splits over
$\mathbf{F}_{q^n}$.

By Proposition 2.6 of [26], the group $\mathbf{T}_n$ defined in §5.6 is a $\varphi(n)$-dimensional
torus.

6.2 Rationality and Birational Isomorphisms 

If $r$ is a positive integer, write $\mathbf{A}^r$ for affine $r$-space. For any field $F$, we have
$\mathbf{A}^r(F) = F^r$, the direct sum of $r$ copies of $F$.

Definition 11. A rational map between algebraic varieties is a function defined
by polynomials or quotients of polynomials that is defined almost everywhere.
A birational isomorphism between algebraic varieties is a rational map that
has a rational inverse (the maps are inverses wherever both are defined). A
$d$-dimensional variety is rational if it is birationally isomorphic to $\mathbf{A}^d$.

Note that birational isomorphisms are not necessarily group isomorphisms.
Note also that rational maps are not necessarily functions; they might fail to
be defined on a lower dimensional set.

By (2), if $\mathbf{T}_n$ is rational (i.e., birationally isomorphic to $\mathbf{A}^{\varphi(n)}$) then almost
all elements of $T_n$ can be represented by $\varphi(n)$ elements of $\mathbf{F}_q$.

The maps $f$ and $j$ in §3 are only birational. The sets $T_n$ and $\mathbf{F}_q^{\varphi(n)}$ are of size
approximately $q^{\varphi(n)}$. The “bad” sets where $f$ and $j$ are not defined correspond to
algebraic subvarieties of dimension at most $\varphi(n) - 1$, and therefore have at most
$cq^{\varphi(n)-1}$ elements for some constant $c$. Thus the probability that an element
lands in the bad set is at worst $c/q$, which will be small for large $q$. In any given
case the bad sets might be even smaller. For example, in §3.6 the bad sets have
2 and 0 elements, respectively.
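To make the notion of a "bad" set concrete, here is a toy illustration for $n = 2$. It uses the classical parametrization $a \mapsto (a + i)/(a - i)$ of the norm-1 elements of $\mathbf{F}_{p^2} = \mathbf{F}_p(i)$, $i^2 = -1$, for a prime $p \equiv 3 \pmod 4$. This map is an assumption chosen for illustration and is not necessarily the map used in §3.4; the function names are likewise hypothetical.

```python
def torus_t2(p):
    """All u + v*i in F_{p^2} (with i^2 = -1) of norm u^2 + v^2 = 1."""
    return {(u, v) for u in range(p) for v in range(p) if (u * u + v * v) % p == 1}


def param(a, p):
    """Return (a + i)/(a - i) = (a + i)^2 / (a^2 + 1) as a pair (u, v)."""
    # a^2 + 1 is nonzero mod p because -1 is a non-square when p = 3 mod 4.
    d = pow(a * a + 1, -1, p)
    return ((a * a - 1) * d % p, (2 * a) * d % p)


def bad_set(p):
    """Elements of T_2(F_p) that are not of the form param(a, p)."""
    image = {param(a, p) for a in range(p)}
    return torus_t2(p) - image
```

With this map one element of $\mathbf{F}_p$ represents each element of the torus except the identity, so the bad set has a single element out of $p + 1$.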

6.3 Obtaining the Rational Maps f and j 

How were the maps in Examples 11 and 12 of [25] and in §3.6 above arrived at? 
The idea is as follows.

The one-dimensional torus $\mathbf{T}_{2,q^3}$ is, by definition, the kernel of the norm map
$N_{\mathbf{F}_{q^6}/\mathbf{F}_{q^3}}$. The torus

$T := \mathrm{Res}_{\mathbf{F}_{q^3}/\mathbf{F}_q}(\mathbf{T}_{2,q^3})$

has dimension 3. As in §3.4, the torus $\mathbf{T}_{2,q^3}$ is rational (i.e., is birationally iso-
morphic to $\mathbf{A}^1$), and thus the torus $T$ is rational (i.e., birationally isomorphic to
$\mathbf{A}^3$). The two-dimensional torus $\mathbf{T}_6$ is the hypersurface cut out by the equation
$N_{\mathbf{F}_{q^6}/\mathbf{F}_{q^2}} = 1$ inside the torus $T$. This hypersurface is defined by a quadratic
equation that can be used to parametrize the hypersurface. We gave examples
of this in Examples 11 and 12 of [25]. Section 3.6 gives an additional example.

6.4 A Group Action on the Torus 

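Next, we define actions of symmetric groups on the tori $\mathbf{T}_n$. Suppose $e$ is a divisor
of $n$, and let $d = n/e$. Since $n$ is square-free, we have $\gcd(e, d) = 1$, so

$\mathbf{Z}/n\mathbf{Z} \cong \mathbf{Z}/e\mathbf{Z} \times \mathbf{Z}/d\mathbf{Z}$.

The symmetric group on $e$ letters, $S_e$, acts on $\mathbf{Z}/e\mathbf{Z}$. Extend this action to an
action of $S_e$ on $\mathbf{Z}/n\mathbf{Z}$, by acting trivially on $\mathbf{Z}/d\mathbf{Z}$. Now define an action of $S_e$
on $\mathbf{A}^n$ $(= \bigoplus_{i \in \mathbf{Z}/n\mathbf{Z}} \mathbf{A}^1)$ as follows. For $\pi \in S_e$,

$(a_i)_{i \in \mathbf{Z}/n\mathbf{Z}} \mapsto (a_{\pi^{-1}(i)})_{i \in \mathbf{Z}/n\mathbf{Z}}$.

We have

$\mathbf{A}^n \underset{\mathbf{F}_{q^n}}{\cong} \mathrm{Res}_{\mathbf{F}_{q^n}/\mathbf{F}_q}\mathbf{A}^1 \supset \mathrm{Res}_{\mathbf{F}_{q^n}/\mathbf{F}_q}\mathbf{G}_m \supset \mathbf{T}_n$.

The action of $S_e$ on $\mathbf{A}^n$ preserves $\mathrm{Res}_{\mathbf{F}_{q^n}/\mathbf{F}_q}\mathbf{G}_m$. However, it does not necessarily
preserve the torus $\mathbf{T}_n$.

Theorem 12 (Lemma 3.5 of [26]) If $p$ is a prime divisor of $n$, then the above
action of $S_p$ on $\mathbf{A}^n$ preserves the torus $\mathbf{T}_n$.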
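The index bookkeeping behind this action can be sketched in a few lines. The code below only illustrates how a permutation of $\mathbf{Z}/e\mathbf{Z}$ is extended to $\mathbf{Z}/n\mathbf{Z}$ through the Chinese Remainder Theorem and how it then permutes coordinates; it does not model the torus itself, and the function names are illustrative.

```python
from math import gcd


def crt_index(i_e, i_d, e, d):
    """Return the unique i in Z/(e*d)Z with i = i_e mod e and i = i_d mod d."""
    for i in range(e * d):
        if i % e == i_e and i % d == i_d:
            return i
    raise ValueError("no solution; e and d must be coprime")


def extend(perm, e, d):
    """Extend a permutation of Z/eZ (as a list) to Z/nZ, trivially on Z/dZ."""
    if gcd(e, d) != 1:
        raise ValueError("e and d must be coprime")
    return [crt_index(perm[i % e], i % d, e, d) for i in range(e * d)]


def act(perm_n, a):
    """Apply (a_i) -> (a_{pi^{-1}(i)}): the entry a_i moves to position pi(i)."""
    out = [None] * len(a)
    for i, a_i in enumerate(a):
        out[perm_n[i]] = a_i
    return out
```

For example, with $n = 6$, $e = 3$, $d = 2$, the transposition of 0 and 1 in $\mathbf{Z}/3\mathbf{Z}$ extends to a permutation of $\mathbf{Z}/6\mathbf{Z}$ that preserves every residue mod 2.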

6.5 Interpreting the Other Systems in Terms of Quotients of Tori 

— The Lucas-based cryptosystems are “based on” the quotient variety $\mathbf{T}_2/S_2$.

— The Gong-Harn system is based on the quotient variety $\mathbf{T}_3/S_3$.

— XTR is based on the quotient variety $\mathbf{T}_6/S_3$.

— Conjectural “Looking beyond XTR” systems would rely on the quotient
variety $\mathbf{T}_{30}/(S_3 \times S_5)$ or $\mathbf{T}_{30}/(S_2 \times S_3 \times S_5)$.

These quotient varieties are not groups. This is why the Lucas-based systems
and XTR do not do straightforward multiplication.

— The $\mathbf{T}_2$-cryptosystem is based on the group (and torus) $\mathbf{T}_2$.

— CEILIDH is based on the group (and torus) $\mathbf{T}_6$.

— The (sometimes conjectural) $\mathbf{T}_n$-cryptosystems are based on the group (and
torus) $\mathbf{T}_n$.

We therefore call the $\mathbf{T}_n$-cryptosystems “torus-based cryptosystems”.

What do we mean when we say that these systems are “based on” certain 
algebraic varieties? 

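XTR works because the variety $\mathbf{T}_6/S_3$ is rational, and the trace map
$\mathbf{F}_{p^6} \to \mathbf{F}_{p^2}$ induces a birational isomorphism:

$\mathbf{T}_6/S_3 \overset{\sim}{\dashrightarrow} \mathrm{Res}_{\mathbf{F}_{p^2}/\mathbf{F}_p}\mathbf{A}^1$.

Similarly for the Lucas-based cryptosystems, the trace map $\mathbf{F}_{p^2} \to \mathbf{F}_p$ induces
a birational isomorphism:

$\mathbf{T}_2/S_2 \overset{\sim}{\dashrightarrow} \mathbf{A}^1$.

More precisely, let $B(d,e)$ denote the image of $\mathbf{T}_n$ in $(\mathrm{Res}_{\mathbf{F}_{q^n}/\mathbf{F}_q}\mathbf{G}_m)/S_e$ (where
$n = de$). By Theorem 3.7 of [26], $B(d,e)$ is birationally isomorphic to
$\mathbf{T}_n/(S_{p_1} \times \cdots \times S_{p_r})$ where $e = p_1 \cdots p_r$ is the prime factorization of $e$. Note that the
quotient map $\mathbf{T}_n \to \mathbf{T}_n/S_e$ induces a (non-surjective) map on $\mathbf{F}_q$-points:

$T_n \cong \mathbf{T}_n(\mathbf{F}_q) \to (\mathbf{T}_n/S_e)(\mathbf{F}_q)$.

Let

$\mathrm{XTR}(d,e) = \{\mathrm{Tr}_{\mathbf{F}_{q^n}/\mathbf{F}_{q^d}}(\alpha) : \alpha \in T_n\} \subset \mathbf{F}_{q^d}$.

When $(d,e) = (1,2)$ or $(2,3)$, then $\mathrm{XTR}(d,e)$ is the set of traces that occur in the
Lucas-based systems and XTR, respectively. In these two cases, $\mathrm{XTR}(d,e)$ can
be naturally identified with the image of $\mathbf{T}_n(\mathbf{F}_q)$ in $(\mathbf{T}_n/S_e)(\mathbf{F}_q)$. More precisely
(see Theorem 13 of [25]), when $(d,e) = (1,2)$ or $(2,3)$, the trace map $\mathrm{Tr}_{\mathbf{F}_{q^n}/\mathbf{F}_{q^d}}$
induces a birational embedding

$\mathbf{T}_n/S_e \hookrightarrow \mathrm{Res}_{\mathbf{F}_{q^d}/\mathbf{F}_q}\mathbf{A}^1$

such that $\mathrm{XTR}(d,e)$ is the image of the composition

$T_n \cong \mathbf{T}_n(\mathbf{F}_q) \to (\mathbf{T}_n/S_e)(\mathbf{F}_q) \hookrightarrow (\mathrm{Res}_{\mathbf{F}_{q^d}/\mathbf{F}_q}\mathbf{A}^1)(\mathbf{F}_q) \cong \mathbf{F}_{q^d}$.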



6.6 “Looking beyond XTR” 

The paper “Looking beyond XTR” [6], building on a conjecture in [8], asks
whether, for $n > 6$, some set of elementary symmetric polynomials can be used
in place of the trace. In particular, [6] asks whether, when $d \mid n$ and $d \mid \varphi(n)$,
one can recover the values of all the elementary symmetric polynomials (i.e., the
entire characteristic polynomial) for $\mathrm{Gal}(\mathbf{F}_{p^n}/\mathbf{F}_{p^d})$ from the first $\varphi(n)/d$ of them
(this was already answered in the affirmative in some cases in [8,13]). If this
were true, one could use the first $\varphi(n)/d$ elementary symmetric polynomials on
the set of $\mathrm{Gal}(\mathbf{F}_{p^n}/\mathbf{F}_{p^d})$-conjugates of an element $h \in T_n$ to represent $h$ by $\varphi(n)$
elements of $\mathbf{F}_p$. More generally, [6] asks whether, for $d \mid n$, one can recover the
entire characteristic polynomial over $\mathbf{F}_{p^d}$ from its first $\lceil \varphi(n)/d \rceil$ coefficients.

The answer is no. In particular, in [25] we show that when $n = 30$ and $p = 7$,
then:

— for $d = 1$, no 8 $(= \varphi(n)/d)$ elementary symmetric polynomials determine
any of the remaining ones (except those determined by the symmetry of the
characteristic polynomial),

— for $d = 1$, no 10 elementary symmetric polynomials determine all of them;

— for $d = 2$, no 4 $(= \varphi(n)/d)$ elementary symmetric polynomials determine all
of them.

Reinterpreted in terms of algebraic tori, the conjectures in [6] imply (see
[26]) that the first eight elementary symmetric polynomials induce a birational
isomorphism over $\mathbf{F}_p$:

$\mathbf{T}_{30}/(S_2 \times S_3 \times S_5) \overset{\sim}{\dashrightarrow} \mathbf{A}^8$,

and the first four elementary symmetric polynomials on the $\mathrm{Gal}(\mathbf{F}_{p^{30}}/\mathbf{F}_{p^2})$-
conjugates of an element in $\mathbf{T}_{30}$ induce a birational isomorphism over $\mathbf{F}_p$:

$\mathbf{T}_{30}/(S_3 \times S_5) \overset{\sim}{\dashrightarrow} \mathrm{Res}_{\mathbf{F}_{p^2}/\mathbf{F}_p}\mathbf{A}^4 \cong \mathbf{A}^8$.

In [26] we prove that these statements are both false, for all but possibly finitely
many primes $p$.

More generally, we have

$\mathbf{T}_n \to B(d,e) \hookrightarrow (\mathrm{Res}_{\mathbf{F}_{q^d}/\mathbf{F}_q}\mathbf{A}^1)^e \cong \mathbf{A}^n$,

where the middle map is induced by the $e$ elementary symmetric poly-
nomials $s_1, \ldots, s_e$ on $\mathrm{Gal}(\mathbf{F}_{q^n}/\mathbf{F}_{q^d})$-conjugacy classes. (Recall that $B(d,e)$ was
defined at the end of §6.5, and $de = n$.)

The conjectures in [6] would imply that, when $d$ divides $\varphi(n)$, then the first
$\varphi(n)/d$ functions $s_1, \ldots, s_{\varphi(n)/d}$ induce a birational isomorphism

$B(d,e) \overset{\sim}{\dashrightarrow} (\mathrm{Res}_{\mathbf{F}_{q^d}/\mathbf{F}_q}\mathbf{A}^1)^{\varphi(n)/d}$.

This is true when the pairs $(d,e)$ are $(1,1)$ (this is Diffie-Hellman), $(1,2)$
(Lucas-based systems), $(1,3)$ (Gong-Harn), and $(2,3)$ (XTR). It is also true (see
[8]) when $\ell$ is a prime and $(d,e) = (1,\ell)$ or $(2,\ell)$. As noted above, we showed in
[25,26] that this is false for $(d,e) = (1,30)$ and $(2,15)$ (in all but at most finitely
many characteristics).

When $(d,e) = (n,1)$, the underlying variety $B(d,e)$ is $\mathbf{T}_n$ itself, corresponding
to the $\mathbf{T}_n$-cryptosystems.

In summary, elementary symmetric polynomials are not the correct functions
to use. In the next section we state a conjecture (of Voskresenskii) that seems
to be closer to the truth.

6.7 Voskresenskii’s Conjecture

Conjecture 13 (Voskresenskii) $\mathbf{T}_n$ is rational; i.e., for every $n$, there is a
birational isomorphism

$\mathbf{T}_n \overset{\sim}{\dashrightarrow} \mathbf{A}^{\varphi(n)}$.

The conjecture is true, and not difficult to prove, if $n$ is a prime power [30].
The conjecture was proved by Klyachko [16] when $n$ is a product of two prime
powers. Explicit birational isomorphisms are given in §5 of [25] and §3.6 above
(see also §3.4 above), in the cases $n = 2$ and 6. A $\mathbf{T}_n$-cryptosystem arises for
every $n$ for which Voskresenskii’s Conjecture is true with efficiently computable
birational maps.

When $n$ is divisible by more than two distinct primes, Voskresenskii’s Con-
jecture is still an open question. In particular, the conjecture is not known when
$n = 30 = 2 \cdot 3 \cdot 5$. We have tried unsuccessfully to construct a birational isomor-
phism between $\mathbf{T}_{30}$ and $\mathbf{A}^8$. It would be interesting to know whether Voskresen-
skii’s Conjecture is true or false when $n = 30$. We have been able to construct
explicit rational maps of low degree in this case, which might be useful if no
birational map exists. For example, an $s$-to-1 map from $\mathbf{T}_{30}$ to $\mathbf{A}^8$ would provide
a lossy compression scheme, and would allow one to represent elements of $T_{30}$
in $\mathbf{F}_q^8 \times \{1, \ldots, s\}$.

Rationality of the varieties $B(1,n)$ (or more generally the varieties $B(d,e)$)
would imply the conjecture in [8].

6.8 Stable Rationality

One reason that Voskresenskii’s Conjecture would be difficult to disprove is that
the tori $\mathbf{T}_n$ are known to always be stably rational over $\mathbf{F}_q$ (see the Corollary on
p. 61 of [30]).

Definition 14. A variety $V$ over $k$ is called stably rational over $k$ if for some $r$
and $s$, $V \times \mathbf{A}^r$ is birationally isomorphic over $k$ to $\mathbf{A}^s$ (i.e., $V \times \mathbf{A}^r$ is rational for
some $r \ge 0$).

Although the stable rationality of $\mathbf{T}_n$ does not allow one to represent elements
of $T_n$ in $\mathbf{F}_q^{\varphi(n)}$, it does allow one to represent elements of $T_n \times \mathbf{F}_q^r$ in $\mathbf{F}_q^s$ for suitable
$r$ and $s$, and this might be useful.

7 Open Problems 

Some goals for the future are: 

— Improve the efficiency of CEILIDH.

— Obtain more efficient key agreement, encryption, and signature schemes, by
generalizing to $\mathbf{T}_{30}$-cryptosystems:

• find explicit and efficient birational isomorphisms $f$ and $j$ between $\mathbf{T}_{30}$
and $\mathbf{A}^8$, if such exist.

• look for special attacks on the discrete log problem in $\mathbf{F}_{q^{30}}^{\times}$.

— Use non-supersingular (i.e., ordinary) abelian varieties to further improve
pairing-based cryptography.

Progress has been made on the last point in the case of elliptic curves; see
for example [7].



Acknowledgments. The authors thank Dan Bernstein, Steven Galbraith, and

Abstract. For $r = 6, 7, \ldots, 11$ we find an elliptic curve $E/\mathbf{Q}$ of rank
at least $r$ and the smallest conductor known, improving on the previous
records by factors ranging from 1.0136 (for $r = 6$) to over 100 (for $r = 10$
and $r = 11$). We describe our search methods, and tabulate, for each
$r = 5, 6, \ldots, 11$, the five curves of lowest conductor, and (except for
$r = 11$) also the five of lowest absolute discriminant, that we found.



1 Introduction and Motivation 

An elliptic curve over the rationals is a curve $E$ of genus 1, defined over $\mathbf{Q}$,
together with a $\mathbf{Q}$-rational point. A theorem of Mordell [23] states that the
rational points on $E$ form a finitely generated abelian group under a natural
group law. The rank of $E$ is the rank of the free part of this group. Currently
there is no general unconditional algorithm to compute the rank. Elliptic curves
of large rank are hard to find; the current record is a curve of rank at least 24
(see [16]).¹

We investigate a slightly different question: instead of seeking curves of large
rank, we fix a small rank $r$ (here $5 \le r \le 11$) and try to make the conductor $N$
as small as possible, which, due to the functional equation for the $L$-function
of the elliptic curve, is more natural than trying to minimize the absolute dis-
criminant $|\Delta|$. The question of how fast the rank can grow as a function of $N$
has generated renewed interest lately, partially due to the predictions made by
random matrix theory about $L$-function analogues [8]. However, there are at
present two different conjectures, one that comes from a function field analogue,
and another from analytic number theory considerations. We shall return to this
in Section 5.

* Supported in part by NSF grant DMS-0200687.
Scholar with the MAGMA Computer Algebra Group at the University of Sydney
¹ Conrey wrote of a curve of rank 26 [7, p. 353], but confirms in e-mail to the authors
that “26” was a typographical error for “24” as in [16].

We try to find $E/\mathbf{Q}$ of high rank and low conductor by searching for elliptic
curves that have many integral points. As stated, this strategy is ill-posed, as
integrality of points is not invariant under change of model (defining equation).
However, if we only consider (say) Néron models then the question makes sense,
and a conjecture of Lang [15, p. 140] links the number of integral points to the
rank, at least in one direction. More explicitly, one might conjecture that there
is an absolute constant $C$ such that the number of integral points on an elliptic
curve $E$ of rank $r$ is bounded by $C^{r+1}$. The best result is due to Silverman [27],
who shows the conjecture is true when the $j$-invariant $j(E)$ is integral, and in
fact proves that for every number field $K$ there is a constant $C_K$ such that the
number of $S$-integral points over $K$ is bounded by $C_K^{(1+r)(1+\delta)+|S|}$, where $\delta$ is the
number of primes of $K$ at which $j(E)$ is nonintegral. Explicit constants appear
in [13]. Szpiro’s conjecture [31], which is equivalent to the Masser-Oesterlé ABC
conjecture [24], states that $\Delta \ll N^{6+o(1)}$. Hindry and Silverman [14] show this
implies that the number of $S$-integral points on a quasi-minimal model of $E/K$
is bounded by $C_K^{(1+r)\sigma_{E/K}+|S|}$, where $\sigma_{E/K}$ is the Szpiro ratio, which is the ratio
of the logarithms of the norms of the discriminant and the conductor of $E/K$.
Finally, Abramovich [1] has shown that the Lang-Vojta conjecture (which states
that the integral points on a quasi-projective variety of log general type are not
Zariski dense, see [35, 4.4]) implies the uniform boundedness of the number of
integral points on rational semistable elliptic curves, but the lack of control over
the Zariski closure of the integral points makes this result ineffective.

Conversely, it is frequently the case that elliptic curves of high rank, and
especially those with relatively small conductor, have many integral points, and
thus our search method is likely to find these curves. In fact, for each $r$ in our
range $5 \le r \le 11$ we found a curve $E$ of rank at least $r$ whose conductor $N$
is the smallest known. For $r = 5$ this was a previously known (see [5]) curve
with $N = 19047851$. For the other $r$ our curve is new, with $N$ smaller than the
previous record by a factor ranging from 1.0136 for $r = 6$ to over 100 for $r = 10$
and $r = 11$. As a byproduct we also find the curves of rank $r$ whose discriminants
$\Delta$ have the smallest absolute values known. We estimate that finding a similarly
good rank 12 curve would take 20-25 times as much work as for rank 11.

Since rational elliptic curves are modular [36,32,9,6,3], the tables of Cre-
mona [10] are complete for $N < 20000$. Hence the lowest conductors for ranks
0-3 are respectively 11, 37, 389, and 5077. The rank 4 record was found by
McConnell and appears in his Maple package APECS [17]; the curve has
$[a_1, a_2, a_3, a_4, a_6] = [1, -1, 0, -79, 289]$ (see Section 2.1 for notation) and its con-
ductor of 234446 is less than half that of the best example in [5]. Stein,
Jorza, and Balakrishnan have verified [28] that there is no rank 4 elliptic curve
of prime conductor less than 234446.

The rest of this paper is organized as follows. In the next section we describe
the methods we used to search efficiently for curves with many small integral
points. We then report on the curves of low conductor and/or absolute discrim-
inant that we found, and compare them with previous records. The next section
reports on our computation of further integral points on each of these record
curves and on many others found in our search. Finally we compare our nu-
merical results with previous speculations on the growth of the minimal $N$ as a
function of $r$.



2 Algorithms 

We describe two algorithms that each find elliptic curves with numerous integral
points whose $x$-coordinates have small absolute value. The input to our algo-
rithms is an ordered triple $(h, I, b_2)$ where $h$ is a height parameter, $I$ is a lower
bound on the number of integral points we want, and $b_2 \in \{-4, -3, 0, 1, 4, 5\}$,
these being the possible values of $b_2 = a_1^2 + 4a_2$ for an elliptic curve in min-
imal Weierstrass form (see below). We then try to find elliptic curves $E$ with
an equation $y^2 = 4x^3 + b_2x^2 + 2b_4x + b_6$ such that there are at least $I$ integral
points on $E$ with $0 \le y \le 2h^3$, $|x| \le h^2$, and $|2b_4| \le 4h^4$. In modifications of
the algorithm, we use variants of these bounds, and in general only have a high
probability of finding the desired curves.



2.1 First Algorithm 

An elliptic curve $E/\mathbf{Q}$ can be written in its minimal Weierstrass form as
$Y^2 + a_1XY + a_3Y = X^3 + a_2X^2 + a_4X + a_6$, where $a_1$ and $a_3$ are 0 or 1 and $|a_2| \le 1$.

We can obtain the “2-torsion” equation²

$y^2 = 4x^3 + b_2x^2 + 2b_4x + b_6$

by completing the square via $y = 2Y + a_1X + a_3$ and $x = X$, so that we get
$b_2 = a_1^2 + 4a_2$, $b_4 = a_1a_3 + 2a_4$, and $b_6 = a_3^2 + 4a_6$. Note that this transformation
preserves integral points; we use the 2-torsion equation rather than the minimal
equation since it is relatively fast to check whether its right-hand side is square.
Fixing a choice of $b_2 \in \{-4, -3, 0, 1, 4, 5\}$ and a height-bound $h$, we search for
curves with integral points by looping over the coordinates of such points. In
particular, we first fix a $b_4$-value with $|2b_4| \le 4h^4$ and then loop over integral
values of $x$ and $y$ with $|x| \le h^2$ and $0 \le y \le 2h^3$, and finally calculate the value
of $b_6$ from the above equation, counting how often each $b_6$-value occurs. Note
that the above bounds imply that $|y^2|$, $|4x^3|$, and $|2b_4x|$ are all bounded by $4h^6$.

This algorithm takes on the order of $h^9$ time, with memory requirements
around $h^6$ for the recording of the $b_6$-values. There are various methods of speed-
ing this up. We can note that neither positive $b_4$ nor negative $b_6$ are likely to give
curves with many integral points, due to the shape of the cubic. From Table 1
we see that $b_4$ cannot be odd when $b_6$ is even. Also, we know that $b_6$ is a square
modulo 4. We can extend this idea to probabilistic considerations; for instance,
a curve with $b_2 = 1$ is not that likely to have numerous integral points unless
$b_4$ is odd and $b_6$ is 1 mod 8. We ran this algorithm for $h = 20$, and an analy-
sis showed that the congruence restrictions most likely to produce good curves
had $(b_2, b_4, b_6)$ mod 8 equal to one of (1, 1, 1), (1, 3, 1), (5, 2, 4), (5, 0, 0), (0, 2, 1),
(0, 0, 0), or (4, 0, 1). Of course, there are curves that have many integral points
yet fail such congruence restrictions, but the percentage of such is rather low
(only 10-20%), and even those that do have numerous integral points appear
less likely to have high rank. However, our table of records does contain some
curves that fail these congruence restrictions, so there is some loss in making
them. With these congruence restrictions, our computation took 15-20 hours
on an Athlon MP 1600 to handle one $b_2$-value for $h = 20$; with no congruence
restrictions, this would be about 5 days. Note that our congruence restrictions
imply that the trials for $b_2 = \pm 4$ should only take half as long as the others.
With this algorithm, we broke the low-conductor records of Tom Womack (from
whose work this sieve search was adapted) for ranks 6, 7, and 8 (see Tables 2
and 3).³

² The second-named author suggests this term because such a model makes it easy to
locate the 2-torsion points on $E$: $(x, y) \in E[2]$ if and only if $y = 0$.



Table 1. Congruence relations with $a_1$ and $a_3$

$a_1$   $a_3$   $b_4$    $b_6$    $x$ and $y$
0       0       even     even     $y$ even
0       1       even     odd      $y$ odd
1       0       even     even     $y \equiv x$ (2)
1       1       odd      odd      $y \not\equiv x$ (2)
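The rows of Table 1 follow directly from $b_4 = a_1a_3 + 2a_4$, $b_6 = a_3^2 + 4a_6$, and $y = 2Y + a_1X + a_3$ with $x = X$. The short sketch below recomputes the parities by enumerating small integer values of $a_4$, $a_6$, $X$, $Y$. The function name and the enumeration bound are arbitrary choices for illustration, and the points $(X, Y)$ are not required to lie on any curve since only parities are tested.

```python
from itertools import product


def table_row(a1, a3, bound=2):
    """Collect the residues forced by (a1, a3) in the 2-torsion model."""
    row = {"b4 mod 2": set(), "b6 mod 2": set(), "b6 mod 4": set(),
           "y mod 2": set(), "(y - x) mod 2": set()}
    values = range(-bound, bound + 1)
    for a4, a6, X, Y in product(values, repeat=4):
        b4 = a1 * a3 + 2 * a4
        b6 = a3 * a3 + 4 * a6
        y = 2 * Y + a1 * X + a3
        row["b4 mod 2"].add(b4 % 2)
        row["b6 mod 2"].add(b6 % 2)
        row["b6 mod 4"].add(b6 % 4)
        row["y mod 2"].add(y % 2)
        row["(y - x) mod 2"].add((y - X) % 2)
    return row
```

A set with a single element means the residue is forced; for $a_1 = 1$ the parity of $y$ alone is not forced, but that of $y - x$ is, which is the content of the last two rows.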



2.2 Second Algorithm 

The number of elliptic curves $E$ with $b_4 \ll h^4$ and $b_6 \ll h^6$ grows as
$h^{10}$. The typical such curve has no small integral points at all: as we have
seen, the number of $(E, P)$, with $E$ as above and $P \in E(\mathbf{Q})$ a small
integral point, grows only as $h^9$, as does the time it takes to find all these
$(E, P)$. But we expect that even in this smaller set the typical $E$ does not
interest us, because it has no integral points other than $\pm P$. We shall see
that there are (up to at most a logarithmic factor) only $O(h^8)$ curves $E$ in
this range together with a pair of integral points $P, P'$ such that
$P' \ne \pm P$, and that again we can find all such $(E, P, P')$ with given
$b_2, b_4$ in essentially constant time per curve. We thus gain a factor of
almost $h$ compared to our first algorithm.‡ Further improvements might be
available by searching for elliptic curves with three or more points, but we do
not know how to do this with the same time and space efficiency.

† In our tables the stated value of the “rank” is actually the rank of the
subgroup generated by small integral points on the curve, which is very likely
to be the actual rank, though in general such results can be quite difficult to
prove.

‡ This pair-finding idea is also used in [11] to find curves $x^3 + y^3 = k$ of
high rank.

We wish to compute all $b_4, b_6, x_1, y_1, x_2, y_2$ in given ranges that
satisfy the pair of equations $y_j^2 = 4x_j^3 + b_2 x_j^2 + 2b_4 x_j + b_6$
($j = 1, 2$). Subtracting these two equations, we find that

\[ (y_2 - y_1)(y_2 + y_1) = (x_2 - x_1)\,[2b_4 + b_2(x_2 + x_1) + 4(x_1^2 + x_1 x_2 + x_2^2)]. \]

We can thus write

\[ x_2 - x_1 = rt \quad\text{and}\quad y_2 - y_1 = rs \quad\text{and}\quad y_2 + y_1 = tu \]

for some integers $r, s, t, u$. From the latter two equations, we see that we
need $rs$ and $tu$ to have the same parity in order for the $y$'s to be integral.
Our expectation is that generically we shall have $r, t \ll h$ and
$s, u \ll h^2$ when the $x$-values are bounded by $h^2$ and the $y$-values by
$h^3$. It is unclear how often this expectation is met. One way of estimating
the proportion is to consider pairs of points $(x_1, y_1), (x_2, y_2)$ with
$|x_i| \le h^2$ on various curves and see what values of $(r, s, t, u)$ are
obtained. This is not quite well-defined from the above; for instance, the
quadruple $(x_2, x_1, y_2, y_1) = (7, 3, 6, 2)$ could have $(r, s, t, u)$ as
either (4, 1, 1, 8) or (2, 2, 2, 4). However, it becomes well-defined upon
imposing the additional condition that $r = \gcd(y_2 - y_1, x_2 - x_1)$.
Experiments show that about 18% of the $(r, s, t, u)$ obtained from this process
satisfy $1 \le r, t \le h$, though the exact percentage can vary significantly
with the curve. Note that swapping $r$ and $t$ or negating either leads either
to a switching of $(x_1, y_1)$ and $(x_2, y_2)$ or to a negation of $y$-values.
Thus we can assume that $1 \le r \le t$.
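
The normalisation $r = \gcd(y_2 - y_1, x_2 - x_1)$ can be illustrated with a short Python sketch. The function name `quadruple` is hypothetical and the code only mirrors the definitions above; it is not taken from the authors' implementation.

```python
from math import gcd


def quadruple(p1, p2):
    """Return (r, s, t, u) with x2-x1 = r*t, y2-y1 = r*s, y2+y1 = t*u,
    normalised by r = gcd(y2 - y1, x2 - x1); None if no integral u exists."""
    (x1, y1), (x2, y2) = p1, p2
    dx, dy = x2 - x1, y2 - y1
    if dx == 0:
        raise ValueError("the two points must have distinct x-coordinates")
    r = gcd(dy, dx)              # positive because dx != 0
    s, t = dy // r, dx // r      # gcd(s, t) = 1 after dividing out r
    if (y2 + y1) % t:
        return None              # does not occur for two points on one integral model
    return r, s, t, (y2 + y1) // t
```

Taking the points (3, 2) and (7, 6) in that order gives (4, 1, 1, 8), the first of the two candidates mentioned in the text. The sketch does not enforce $r \le t$; as the text notes, that can be arranged by swapping or negating.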

We rewrite the above equation in the form

\[ rstu = rt\,[2b_4 + b_2(x_1 + x_2) + 3(x_1 + x_2)^2 + (x_1 - x_2)^2] \]

and define $z = x_1 + x_2$ so that $su = 2b_4 + b_2 z + 3z^2 + (rt)^2$. Our
algorithm is now the following. Given one of the six possible values of $b_2$,
we loop over $2b_4$-values between $-4h^4$ and 0 (implementing our above comment
that positive $b_4$-values are not that likely to give curves with many integral
points). For each value of $b_4$ we loop over pairs of integers $(r, t)$ that
satisfy $1 \le r \le t \le h$. We then compute $l = rt$ and loop over values of
$2x_2$ (that is, $z + l$) with $-2h^2 \le 2x_2 \le 2h^2$. Next we compute the
quantity $W = 2b_4 + b_2 z + (l^2 + 3z^2)$ and factor this in all possible ways
as $W = su$. We then take $y_2 = (rs + tu)/2$ (assuming that $rs$ and $tu$ have
the same parity) and compute $b_6 = y_2^2 - 4x_2^3 - b_2 x_2^2 - 2b_4 x_2$. As
before, we record the $b_6$-values and count how many times each occurs. This
algorithm takes about $h^8 \log h$ time, where the logarithmic factor comes from
solutions of $W = su$, assuming we can find these relatively fast via a lookup
table. Already at $h = 20$, a version of this algorithm ran in under an hour and
found most of the curves found by the first algorithm. One can view this
algorithm as looping over pairs of $x$-values (both of size $h^2$), or more
precisely the sum (given by $z$) and difference (given by $l$) of such a pair,
and then reconstructing the $y$-values by factoring. Thus the inner loop takes
time $h^4 \log h$ instead of $h^5$ as in the first algorithm.
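
The loop structure just described can be sketched in Python for a single fixed pair $(b_2, b_4)$. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `factor_pairs` and `pair_search` are hypothetical, factoring is done by trial division rather than a lookup table, the case $W = 0$ is simply skipped, and none of the congruence or size restrictions discussed elsewhere in the text are applied.

```python
from collections import Counter


def factor_pairs(w):
    """All integer pairs (s, u) with s * u == w, for w != 0 (trial division)."""
    pairs = []
    d = 1
    while d * d <= abs(w):
        if w % d == 0:
            for s in {d, abs(w) // d}:
                pairs.append((s, w // s))
                pairs.append((-s, -(w // s)))
        d += 1
    return pairs


def pair_search(b2, b4, h):
    """Count, for each b6, how many (r, s, t, u) quadruples produce it."""
    hits = Counter()
    for r in range(1, h + 1):
        for t in range(r, h + 1):
            l = r * t                               # l = x2 - x1
            for x2 in range(-h * h, h * h + 1):     # so 2*x2 = z + l
                z = 2 * x2 - l                      # z = x1 + x2
                w = 2 * b4 + b2 * z + l * l + 3 * z * z
                if w == 0:
                    continue                        # degenerate case, skipped here
                for s, u in factor_pairs(w):
                    if (r * s + t * u) % 2:
                        continue                    # y2 would not be integral
                    y2 = (r * s + t * u) // 2
                    b6 = y2 * y2 - 4 * x2**3 - b2 * x2 * x2 - 2 * b4 * x2
                    hits[b6] += 1
    return hits
```

As a small check, `pair_search(0, 0, 3)` records $b_6 = 4$ several times, reflecting the integral points $(-1, 0)$, $(0, \pm 2)$, $(2, \pm 6)$ on $y^2 = 4x^3 + 4$.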

2.3 Implementation Tricks 

We now describe the various tricks we used in the implementation. We shall see
that our $b$-congruence restrictions allow us to limit the $z$ and $l$ values in
a productive way. First we consider the cases where $b_2$ is odd. Given a fixed
$b_4$-value we only loop over $z$'s and $l$'s that are both odd, and can note
that this makes $W$ odd. Actually we do not loop over $l$ but determine it as
$l = rt$; thus we are looping over odd $r$ and $t$ with $1 \le r \le t \le h$.
It may seem that this loses a factor of 4 of $(r, s, t, u)$ quadruples (with the
$z$-restriction losing nothing because $z$ must be of the same parity as $rt$),
but we claim that it is actually only a factor of 2 for “interesting” curves.
Indeed, though our yield of $b_6$-values will drop by a factor of 4 because of
this parity restriction on both $r$ and $t$, many of these values of $b_6$ will
correspond to curves on which all integral points have $x$-coordinates of the
same parity. Since $l$ is the difference of two $x$-coordinates, this implies
that $l$ must be even for all pairs of integral points. These curves, which are
plainly less likely to have a large number of integral points, are
over-represented in the curves we ignore through not considering even $l$. From
this we get our heuristic assertion that restricting to odd $l$ loses only a
factor of about 2.

We also consider only the values of $z$ for which $|W|$ is less than a certain
bound. This serves a dual purpose in that it speeds up the algorithm and also
reduces the size of the tables used for factoring. We see that
$W = 2b_4 + b_2 z + (l^2 + 3z^2)$ should be of size $h^4$, and so we restrict
the size of $W$ via the inequality $|W| \le 2h^4/U$, where $U$ is a parameter we
can vary (we had $U = 1$ for the experiments with $h = 30$ and $h = 40$). Again
it is not immediately clear how many $(r, s, t, u)$ quadruples we miss by making
this restriction on $W$, and again the proportion can depend significantly upon
the curve (curves with $b_4$ near $-2h^4$ lead to more quadruples with large
$|W|$ than those with $b_4$ close to 0). Experimentation showed that with
$U = 1$ we catch on average about 83% of the relevant $b_6$-values under this
restriction. Our expectation might be approximate inverse linearity of the catch
rate in $U$, though only in the limit as $U \to \infty$. Experimentation showed
that with $U = 8$ our catch rate is down to 27%, while at $U = 32$ it is about
10%. However, there is interdependence between this restriction and that on the
size of $r$ and $t$: when $r$ and $t$ are both small, this corresponds to a
small $x$-difference, which implies a small $y$-difference, and so $W = su$
should also be diminished in size. In a final accounting of the proportion of
$(r, s, t, u)$ quadruples, including the loss of a factor of 2 from the parity
restriction on $l$, we find that with $U = 1$ we catch 7.4% of the quadruples,
with $U = 8$ we catch 2.5%, and with $U = 32$ we catch just under 1%. Most of
the curves of interest to us have at least 40 integral points within the given
bounds $|x| \le h^2$ and $0 < y \le 2h^3$, and thus have at least 780 pairs of
integral points. So missing 93% or more of the $(r, s, t, u)$ quadruples does
not trouble us; indeed, our “laziness” in not finding all the possible
$(r, s, t, u)$ is quadratically efficient compared to what we would achieve via
similar “laziness” in our first algorithm.
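
The percentages quoted above can be cross-checked with a little arithmetic. The sketch below simply multiplies the three quoted factors (18% for small $r, t$, the $W$-bound catch rate, and the factor 2 from parity) as if they were independent, which the text warns is only approximately true; the constant and function names are hypothetical.

```python
from math import comb

SMALL_RT = 0.18                            # share of quadruples with 1 <= r, t <= h
PARITY = 0.5                               # heuristic loss from restricting to odd l
W_CATCH = {1: 0.83, 8: 0.27, 32: 0.10}     # catch rate under |W| <= 2h^4/U


def overall_rate(U):
    """Naive product of the three factors quoted in the text."""
    return SMALL_RT * W_CATCH[U] * PARITY


def expected_pairs(n_points, U):
    """Expected number of point pairs detected on a curve with n_points points."""
    return comb(n_points, 2) * overall_rate(U)
```

The products land close to the reported 7.4%, 2.5% and just under 1%, and a curve with 40 integral points has `comb(40, 2) == 780` pairs, so even at $U = 32$ several pairs are expected to be detected.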

So far we assumed that $b_2$ was odd, but similar ideas apply also for even
values of $b_2$. When $b_2 = \pm 4$, we took $l$ and $z$ to be even but not
congruent modulo 4. This ensures that $W$ is 4 mod 8. Similarly, when $b_2 = 0$
and $b_6$ is odd, we take $l$ and $z$ to be even and congruent modulo 4, again
ensuring that $W$ is 4 mod 8. To implement these restrictions, we took $r$ to be
2 mod 4 with no restriction on $t$ other than $t > r$ if $t$ itself is also
2 mod 4 (with both variables less than $h$ as before). Again we required that
$|W| \le 2h^4/U$, and here we have various restrictions on the decomposition
$W = su$ depending on $t$ mod 4. Specifically, we can always take $s$ odd and
$4 | u$, we can take both $s$ and $u$ even if $t$ is odd, and we can take
$4 | s$ and $u$ odd if $t$ is 2 mod 4, as we need for $y$ to be odd in these
cases. When $b_2 = 0$ and $b_6$ is even, we take $z$ and $l$ both to be odd,
which makes $W$ be 4 mod 8, and we need both $s$ and $u$ to be even (hence each
2 mod 4) for $y$ to be even. As above, the loss in the number of interesting
$b_6$-values from these restrictions is not much more than a factor of 2.
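
The parity claims about $W$ can be verified by brute force over residues. The sketch below assumes the pairing of cases with the residue classes listed earlier: $b_2 = \pm 4$ with $b_4 \equiv 0 \pmod 8$, $b_2 = 0$ and $b_6$ odd with $b_4 \equiv 2 \pmod 8$, and $b_2 = 0$ and $b_6$ even with $b_4 \equiv 0 \pmod 8$. That pairing and the function names are assumptions made for illustration.

```python
def W(b2, b4, z, l):
    """The quantity W = 2*b4 + b2*z + l^2 + 3*z^2 from the text."""
    return 2 * b4 + b2 * z + l * l + 3 * z * z


def check_parity_claims(span=16):
    """Check the stated residues of W over a box of small z, l and b4 values."""
    rng = range(-span, span + 1)
    for z in rng:
        for l in rng:
            both_odd = z % 2 == 1 and l % 2 == 1
            both_even = z % 2 == 0 and l % 2 == 0
            for b4 in rng:
                if both_odd:
                    for b2 in (-3, 1, 5):            # odd b2: W is odd
                        assert W(b2, b4, z, l) % 2 == 1
                    if b4 % 8 == 0:                   # b2 = 0, b6 even
                        assert W(0, b4, z, l) % 8 == 4
                if both_even and (z - l) % 4 == 2 and b4 % 8 == 0:
                    for b2 in (4, -4):                # b2 = +-4
                        assert W(b2, b4, z, l) % 8 == 4
                if both_even and (z - l) % 4 == 0 and b4 % 8 == 2:
                    assert W(0, b4, z, l) % 8 == 4    # b2 = 0, b6 odd
    return True
```

Since $W$ modulo 8 depends only on small residues of its arguments, a modest box of values already covers every case.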

2.4 More Tricks 

To reduce the memory needed for the counting of $b_6$ values, we used the
following idea. We create an array of $2^L$ counters (of size 16 bits each); for
instance, for $h = 30$ we used $L = 19$. Then for each $b_6$-value we obtain
from above, we reduce $\lfloor b_6/8 \rfloor$ modulo $2^L$, and increment the
corresponding counter. In other words, we only record $b_6$ modulo $2^{L+3}$.
At the end of the loops over $r$, $t$, and $z$, we extract the counters with at
least 10 hits. These residue classes are then passed to a secondary test phase.
Here we set up counters for the values of $b_6$ with $0 < b_6 < 4h^6$ that are
in the desired residue class $\bar{b}_6$ modulo $2^{L+3}$. We then run through
integral $x$ with $|x| \le h^2$, and for each $x$-value determine the
corresponding positive $y$-values such that
$y^2 \equiv 4x^3 + b_2 x^2 + 2b_4 x + \bar{b}_6 \pmod{2^{L+3}}$ via a lookup
table of square roots modulo $2^{L+3}$. Most of these $y$-values exceed $2h^3$,
and we thus ignore them. If not, we compute $b_6$ exactly from $x$ and $y$, and
increment the corresponding counter. After running over all the $x$-values, we
then check for large counter values. By taking $2^L$ somewhere around $h^4$
(note that this is about how many $b_6$-values we generate), we can use this
method to handle a $b_6$-congruence-class in essentially $h^2$ time. This is
generically small compared to the $h^4$ time for the loops over $r$, $t$, and
$z$; when $h = 30$ we averaged about 100 congruence classes checked for each
$b_4$-value, but the time for the loops still dominated.
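
The first phase of this counting scheme amounts to a small hash table indexed by $\lfloor b_6/8 \rfloor \bmod 2^L$. The following Python sketch shows one way this could look; the names are hypothetical, the counters saturate at 65535 to mimic 16-bit storage (an assumption, since the text does not say how overflow was handled), and the secondary test phase is not included.

```python
from array import array


def make_counters(L):
    """Array of 2**L unsigned 16-bit counters, all zero."""
    return array("H", [0]) * (1 << L)


def record(counters, b6, L):
    """Increment the counter for floor(b6/8) mod 2**L and return its index."""
    idx = (b6 // 8) % (1 << L)
    if counters[idx] < 0xFFFF:       # saturate instead of overflowing
        counters[idx] += 1
    return idx


def busy_classes(counters, threshold=10):
    """Indices whose counters reached the threshold (10 hits in the text)."""
    return [i for i, c in enumerate(counters) if c >= threshold]
```

Two values of $b_6$ that differ by a multiple of $2^{L+3}$ land in the same counter, which is why the surviving classes need the exact recount described in the text.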

3 Experimental Results 

We ran this algorithm with $h = 30$ and $U = 1$ with a few more congruence
classes in consideration, taking about a day for each $(b_2, b_4, b_6)$ class.
We then proceeded to run it for $h = 40$ and $U = 1$, and then $h = 60$ and
$U = 8$, taking a few weeks for each $(b_2, b_4, b_6)$ class. Other runs were
done with the “better” congruence restrictions of $(b_2, b_4, b_6)$ with varied
parameters up to $h = 90$ and $U = 48$. Though with the $U = 48$ restriction we
are catching less than 1% of the $(r, s, t, u)$ quadruples, by this time we
expect that most interesting curves have 60 or more integral points with
$|x| \le h^2$ and $0 < y \le 2h^3$; indeed, even with the $h = 60$ search all
the record curves we found had at least 70 integral points in this range.

Table 2 lists minimal equations for each of the five curves of smallest
conductor $N$ for each rank from 5-11 that were found by the above method.
Table 4 lists similar data for smallest absolute discriminant $|\Delta|$. The
rank 5 data agree with the data from the Elliptic Curve Database [29]. The $I$
column gives how many $x$-coordinates of integral points we found (see
Section 4) for the given equation. Some of the curves fail our congruence
conditions on $(b_2, b_4, b_6)$, but they still can be found via a non-minimal
model; indeed, letting $c_4$ and $c_6$ be the invariants of the minimal model,
the model with invariants $12^4 c_4$ and $12^6 c_6$ has $b_2 = 0$ and
$4 | b_4$ and $8 | b_6$ and is thus in the (0, 0, 0) class. In this way, from
$(b_2, b_4, b_6) = (0, -1826496, 2637633024)$ we recover the curve
[1,0,0,-22,219].

Table 2. Low conductor records for ranks 5-11

  [a_1,a_2,a_3,a_4,a_6]               N                          |Δ|/N   I     r
  ---------------------------------   ------------------------   -----   ---   --
  [0,0,1,-79,342]                     19047851                   1       39    5
  [1,0,0,-22,219]                     20384311                   1       29    5
  [0,0,1,-247,1476]                   22966597                   1       40    5
  [1,-1,0,-415,3481]                  34672310                   10      52    5
  [0,0,0,-532,4420]                   37396136                   32      52    5
  [1,1,0,-2582,48720]                 5187563742                 6       71    6
  [0,0,1,-7077,235516]                5258110041                 243     67    6
  [1,-1,0,-2326,43456]                5739520802                 2       60    6
  [1,-1,0,-16249,799549]              6601024978                 184     68    6
  [1,-1,1,-63147,6081915]             6663562874                 32768   88    6
  [0,0,0,-10012,346900]               382623908456               32      101   7
  [1,0,1,-14733,694232]               536670340706               8       77    7
  [0,0,1,-36673,2704878]              814434447535               5       84    7
  [1,-1,0,-92656,10865908]            858426129202               142     92    7
  [1,-1,0,-18664,958204]              896913586322               26      109   7
  [1,-1,0,-106384,13075804]           249649566346838            14      124   8
  [1,-1,0,-222751,40537273]           292246301470558            2       101   8
  [0,0,0,-481663,128212738]           314214346667560            160     141   8
  [1,-1,0,-71899,5522449]             314658846776578            34      130   8
  [1,-1,0,-124294,14418784]           315734078239402            106     131   8
  [1,-1,0,-135004,97151644]           32107342006814614          122     191   9
  [1,-1,0,-613069,98885089]           43537345103385386          242     203   9
  [0,0,1,-3835819,2889890730]         62986816173592807          67      142   9
  [1,0,1,-1493028,701820182]          72070075910145406          2       139   9
  [1,0,1,-1076185,496031340]          77211251506212554          344     156   9
  [0,0,1,-16312387,25970162646]       10189285026863130793       1331    262   10
  [1,-1,0,-10194109,12647638369]      22006161865320788846       58      241   10
  [0,0,1,-21078967,35688990786]       22630148490190627609       2173    238   10
  [1,-1,0,-1536664,648294124]         25440555737235843986       2       207   10
  [1,-1,0,-4513546,3716615296]        39432942782223365758       2       179   10
  [0,0,1,-16359067,26274178986]       18031737725935636520843    1       229   11
  [1,-1,0,-38099014,115877816224]     66484354768372183177742    34      281   11
  [1,-1,0,-41032399,106082399089]     219576020293485812169274   2       236   11
  [1,-1,0,-34125664,69523358164]      227946110025657660240686   2       215   11
  [1,-1,0,-56880994,168642718624]     252948166615918192888894   2       235   11

Table 3. Value of log N for old and new rank records

  r      6        7        8        9        10       11
  ---    ------   ------   ------   ------   ------   ------
  Old    22.383   27.703   33.962   40.721   49.033   55.852
  New    22.370   26.670   33.151   38.008   43.768   51.246



Table 4. Low absolute discriminant records for ranks 5-10

  [a_1,a_2,a_3,a_4,a_6]            |Δ|                     I     r
  ------------------------------   ---------------------   ---   --
  [0,0,1,-79,342]                  19047851                39    5
  [1,0,0,-22,219]                  20384311                29    5
  [0,0,1,-247,1476]                22966597                40    5
  [0,1,1,-100,110]                 55726757                33    5
  [0,0,1,-139,732]                 59754491                32    5
  [1,0,0,-9227,340354]             6822208199              36    6
  [0,0,1,-277,4566]                7647224363              49    6
  [0,0,1,-379,5172]                8072781371              51    6
  [0,0,1,-889,9150]                8796007189              54    6
  [0,1,1,-390,5460]                9694585723              43    6
  [0,0,1,-1387,68046]              1829517077483           71    7
  [0,0,1,-5707,151416]             1991659717477           68    7
  [1,0,1,-5983,164022]             2010552189452           72    7
  [1,0,1,-14505,667472]            2132568452204           71    7
  [0,0,1,-15577,744876]            2206378706437           71    7
  [0,1,1,-23846,1022562]           409086620841461         78    8
  [0,0,1,-23737,960366]            457532830151317         96    8
  [0,1,1,-16440,1394010]           561715239383323         84    8
  [1,-1,0,-222751,40537273]        584492602941116         101   8
  [1,-1,0,-201814,34925104]        643509175703572         109   8
  [0,0,1,-167419,30261330]         95276302704064331       135   9
  [1,0,1,-1493028,701820182]       144140151820290812      139   9
  [0,0,1,-514507,140806716]        151673348057775877      126   9
  [0,0,1,-402157,96291336]         157107745029925477      131   9
  [0,0,1,-826609,289956150]        172539371946838571      120   9
  [1,-1,0,-1536664,648294124]      50881111474471687972    207   10
  [0,0,1,-1788817,843180666]       59202439687694448757    176   10
  [1,-1,0,-4513546,3716615296]     78865885564446731516    179   10
  [0,1,1,-1856500,1072474760]      87950374485438204043    154   10
  [0,0,1,-2438527,1545098346]      103294665688000244363   173   10



How good is this method at finding elliptic curves of low conductor $N$ and
relatively high rank? Obviously if such a curve has few integral points then we
will not find it. Indeed, it was suggested to us by J. Silverman that for large
ranks $r$ the smallest conductor curve might not have $r$ independent integral
points. However, for the ranks we consider there are sufficiently many
independent integral points; the same is true for Mestre's rank 15 curve [21],
but apparently not for later rank records. Note also that our search operates by
increasing $b_4$ and $b_6$ corresponding to some height parameter, which is not
exactly the same as simply increasing the absolute value $|\Delta|$ of the
discriminant, which again is not quite the same as just increasing $N$. Finally,
the probabilistic nature of our algorithm and the necessity of restricting to
“likely” congruence classes also cast doubt on the exhaustiveness of our search
procedure. However, we are still fairly certain that the curves we found for
ranks 5-8 are indeed the actual smallest conductor curves for those ranks. Note
that our methods were almost exhaustive in the region of interest ($h$ up to
about 30), and were verified with the first algorithm in much of this range. For
rank 9 we could be missing some curves with large $|\Delta|/N$ and $h$ around 50
or so; perhaps this range should be rechecked with a smaller $U$-parameter.
Indeed, for a long time the second curve on the $r = 10$ list, which we found
with an $h = 60$ search, was our rank 10 record, but then a run with $h = 80$
found the first and third curves which have $h$-values of about 64 and 68
respectively. We have yet to find many rank 11 curves with large $|\Delta|/N$;
note that our current record curve in fact has prime conductor. This suggests
that there still could be significant gains here. However, Table 3, which lists
values of $\log N$ for the old records and our new ones, indicates that our
method has already shown its usefulness. There does not seem to have been any
public compilation of such records before Womack [37] did so on his website in
the year 2000, soon after he had found the records for ranks 6-8 via a
sieve-search.* The records for ranks 9-10 were again due to Womack but from a
Mestre-style construction [20], with Mestre listing the rank 11 record in [19]
(presumably found by the methods of [18]). There was no particular reason to
expect the old records for ranks 9-11 to be anywhere near the true minima, as
they were constructed without a concentrated attempt to make the conductor or
any related quantity as small as possible.

4 Counting Integral Points 

Since we have curves that have many integral points of small height, it is natural 
to ask how many integral points these curves have overall, with no size restriction. 
For our curves of rank higher than 8, current methods, as described in [30], do 
not yet make it routine to list all the integral points and to prove that the list 
is complete. Indeed, even verification that the rank is actually what it seems to 
be is not necessarily routine. 

However, we have at least two ways to search for integral points and thus 
obtain at least a lower bound for the number of integral points. One method is 
a simple sieve-assisted search, which can reach x-values up to 10^^ in just over 
an hour on an Athlon MP 1600. The other method is to write down a linearly 
independent set from the points we have, and then take small linear combinations 

® Prior to Womack, there were records listed in the Edinburgh dissertation of Nigel 
Suess (2000); it appears that Womack and Suess enumerated these lists in part to 
help dispense Cremona from answering emails about the records. Indeed, the rank 4 
record of McConnell [17] mentioned above was relatively unknown for quite a while. 
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of these.® For this, we took the linearly independent set that maximized the 
minimal eigenvalue of the height-pairing matrix (as in the “ci-optimal basis” 
of [30]), subject to the condition that the set must generate (as a subgroup of 
E{Q)) all the integral points in our list. We then tried all {{2m + I)” — l)/2 
relevant linear combinations with coefficients bounded in absolute value by m.^ 
With r = 11 and m = 3 this takes about an hour. 
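
The count of linear combinations can be made concrete: coefficient vectors in $[-m, m]^r$ number $(2m+1)^r$, the zero vector is discarded, and $v$ and $-v$ give points with the same $x$-coordinate, so only one of each pair is needed. The Python sketch below enumerates such representatives; the function names are hypothetical and the actual point arithmetic is not shown.

```python
from itertools import product


def combos_up_to_sign(r, m):
    """Yield nonzero vectors in [-m, m]^r, one from each pair {v, -v}."""
    for v in product(range(-m, m + 1), repeat=r):
        first = next((c for c in v if c), 0)   # first nonzero coefficient, if any
        if first > 0:
            yield v


def combo_count(r, m):
    """Closed form ((2m + 1)^r - 1) / 2 for the number of such vectors."""
    return ((2 * m + 1) ** r - 1) // 2
```

For $r = 11$ and $m = 3$ the closed form gives 988663371 combinations, which is consistent with a run time on the order of an hour.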

The maximal number of integral points we found on a curve was 281 x 2
for [1,-1,0,-38099014,115877816224]. This can be compared with the rank 14
curve [0,0,1,-2248232106757,1329472091379662406] that is listed by Mestre
[19], which we find has 311 x 2 integral points with |x| < 10^^, plus at least
7 x 2 more that were found with linear combinations as above. Note that amongst
the curves of a given rank there is not much correspondence between number of
integral points and smallness of conductor. For instance, we have no idea which
curve of rank 7 has the maximal number of integral points;§ trying a few curves
found in our search turned up [1,-1,0,-22221159,40791791609] which has at
least 165 x 2 integral points, but conductor 13077343449126, more than 34 times
larger than our record. Note that $|\Delta|/N$ = 2 • 3'^11^23^ for this curve;
in general, large values of $|\Delta|/N$ seem to correlate with large counts of
integral points (see Table 2).

5 Growth of Maximal Rank as a Function of Conductor 

We review two different heuristics and conjectures for the growth of the
maximal rank of an elliptic curve as a function of its conductor, and then
indicate which is more likely according to our data. The first conjecture is due
to Murty and appears in the appendix to [25]. He first notes that, similar to a
heuristic of Montgomery [22, pp. 512-513] regarding the $\zeta$-function, it is
plausible that $\arg L_E(1 + it) \ll \sqrt{\log(Nt)/\log\log(Nt)}$ as
$t \to \infty$. Murty speculatively applies this bound in a small circle of
radius $1/\log\log N$ about $s = 1$. He then claims that Jensen's Theorem
implies that the order of vanishing of $L_E(s)$ at $s = 1$ is bounded by
$C\sqrt{\log N/\log\log N}$, though we cannot follow the argument. Assuming the
Birch-Swinnerton-Dyer conjecture [2], the same upper bound holds for the rank of
the elliptic curve. However, the Montgomery heuristic comes from approximating
$\log \zeta(s)$ by a truncated sum of $p^{-s}$ over primes, up to an error
$O_\sigma(1)$, for $\sigma > 1/2$ and assuming that the $p^{-it}$ act like
random variables; upon taking a limit as $\sigma \to 1/2$, this implies the
asserted bound of $\sqrt{\log t/\log\log t}$, but only for large $t$. Indeed, in
our elliptic curve case with small $t$, we should have an approximation (see
[12]) more like $\log L_E(s) \sim \sum_{p<X} a_p/p^s$, and it is unclear whether
the variation of the $a_p$ or that of the $p^{-it}$ should have the greater
impact. Finally, Conrey and Gonek [8] contend that Montgomery's heuristic could
be misleading; they suggest that $\log |\zeta(1/2 + it)|$ (and maybe
analogously the argument) can be as big as $C \log t/\log\log t$ instead of the
square root of this. One idea is that the above limit as $\sigma \to 1/2$
disregards a possibly larger error term coming from zeros of $\zeta(s)$; the
asymmetry of upper and lower bounds for $\log |\zeta(1/2 + it)|$ makes the
analysis delicate. A classic paper of Shafarevich and Tate [26] shows that in
function fields the rank grows at least as fast as the analogue of
$\log N/(2 \log\log N)$. However, the curves used in this construction were
isotrivial, and thus fairly suspect for evidence toward a conjecture over number
fields. Ulmer recently gave non-isotrivial function field examples with this
growth rate, and conjectured [33, Conjecture 10.5] that this should be the
proper rate of growth even in the number field case, albeit possibly with a
different constant. In a different paper [34, p. 19], Ulmer notes that certain
random matrix models suggest that the growth rate is as in the function field
case; presumably this is an elliptic curve analogue of the work of [8].

† The possible size of coefficients in such linear combinations can be bounded
via elliptic logarithms (possibly $p$-adic) as in [30] and later works. Also, as
indicated by Zagier [38], one can combine elliptic logarithms with lattice
reduction to search for large integral points, but we did not do this.

‡ We do not compute sums of points on elliptic curves directly over the
rationals, but instead work modulo a few small primes and use the Chinese
Remainder Theorem.

§ The maximal count of integral points may well be attained by a curve with
nontrivial torsion, whose discriminant would then be too large to be found by
our search.

Figure 1 plots the rank $r$ versus $\log N/\log\log N$, where $N$ is the
smallest known conductor for an elliptic curve of rank $r$. A log-regression
gives us that the best-fit exponent is 0.975, much closer to the exponent of 1.0
of Ulmer than to the 0.5 of Murty. Note that an improvement in the records for
ranks 9-11 would most likely increase the best-fit exponent.

Fig. 1. Plot of rank versus $\log N/\log\log N$

Assuming the growth is linear, the line of best fit is approximately
$r = 0.865\,(\log N/\log\log N) - 0.126$. But this could mislead; GRH plus BSD
implies

\[ r \le \frac{1}{2}\,\frac{\log N}{\log\log N}\left(1 + \frac{\log 8e}{\log\log N} + O\!\left(\frac{1}{(\log\log N)^2}\right)\right). \]

The main term in the above already appears in Corollary 2.11 of [4] (see also
Proposition 6.11), and we have simply calculated the next term in the expansion.
To get more reliable data, we would need to consider curves with $\log\log N$
rather large, which is of course quite difficult.
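
As a rough sanity check of the quoted line of best fit, one can evaluate it at the record conductors from Table 2. The sketch below assumes that the fitted line is expressed in the variable $\log N/\log\log N$ plotted in Figure 1 (natural logarithms); the dictionary and function names are hypothetical, and only ranks 5-11 are used because those are the conductors tabulated here.

```python
from math import log

# Smallest conductor found for each rank, copied from Table 2.
RECORD_CONDUCTORS = {
    5: 19047851,
    6: 5187563742,
    7: 382623908456,
    8: 249649566346838,
    9: 32107342006814614,
    10: 10189285026863130793,
    11: 18031737725935636520843,
}


def plot_variable(N):
    """The horizontal coordinate log N / log log N."""
    return log(N) / log(log(N))


def fitted_rank(N):
    """Rank predicted by the quoted line r = 0.865 * x - 0.126."""
    return 0.865 * plot_variable(N) - 0.126


residuals = {r: fitted_rank(N) - r for r, N in RECORD_CONDUCTORS.items()}
```

Under these assumptions each predicted value lies within a fraction of a unit of the true rank, which is consistent with the near-linear growth described in the text.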

Finally we mention a possible heuristic refinement of the above upper bound.
The bound comes from a use of the Weil explicit formula (see [4, 2.11]) to
obtain the relation
$\sum_\gamma h(\gamma \log\log N) \sim \log N/(2 \log\log N)$, where
$h(t) = (\sin t/t)^2$ and the sum is over imaginary parts of nontrivial zeros of
$L_E(s)$, counted with multiplicity. When only the high-order zero at
$\gamma = 0$ contributes, we get the stated upper bound. In the function field
case, the other zeros contribute little because they are all near the minima of
$h$. This is unlikely to occur in the number field case. Also unlikely is the
idea that the other zeros have negligible contribution due to the $1/t^2$ decay
of $h$. Thus the other zeros are likely to have some impact; however, it is not
clear how large this impact will be.
Abstract. On the lines of the binary gcd algorithm for rational integers,
algorithms for computing the gcd are presented for the ring of integers in
$\mathbf{Q}(\sqrt{d})$ where $d \in \{-2, -7, -11, -19\}$. Thus a binary gcd
like algorithm is presented for a unique factorization domain which is not
Euclidean (case $d = -19$). Together with the earlier known binary gcd like
algorithms for the ring of integers in $\mathbf{Q}(\sqrt{-1})$ and
$\mathbf{Q}(\sqrt{-3})$, one now has binary gcd like algorithms for all complex
quadratic Euclidean domains. The running time of our algorithms is $O(n^2)$ in
each ring. While there exists an $O(n^2)$ algorithm for computing the gcd in
quadratic number rings by Erich Kaltofen and Heinrich Rolletschek, it has large
constants hidden under the big-oh notation and it is not practical for medium
sized inputs. On the other hand our algorithms are quite fast and very simple to
implement.



1 Introduction 

The greatest common divisor is one of the most fundamental concepts of number
theory. Elementary number theory texts introduce the gcd very early and also
present an algorithm to compute it, Euclid's algorithm. However, it is not
possible to extend Euclid's algorithm to all number rings. The rings in which
one can extend Euclid's algorithm are called Euclidean rings. A large amount of
effort has been put into identifying Euclidean number rings. Franz
Lemmermeyer's paper on Euclidean number rings [12] contains an almost complete
list of all known Euclidean number rings.

In 1965 a different algorithm to compute the gcd was presented by J. Stein [19].
Apart from being very simple to understand, this algorithm has the virtue of
being efficiently implementable on a computer, as the only operations used by
the algorithm are addition, subtraction and division by 2. Since divisions by 2
can be performed by right shifts (on a computer), this algorithm essentially has
no divisions at all. This algorithm is popularly known as the binary gcd
algorithm. In this paper we present extensions of this algorithm to four complex
quadratic rings.

* Basic Research in Computer Science (www.brics.dk), funded by the Danish National 
Research Foundation. 

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 57-71, 2004. 

© Springer-Verlag Berlin Heidelberg 2004







S. Agarwal and G.S. Frandsen 



1.1 Overview of Results 

We successfully generalize the binary gcd algorithm to compute the gcd in the
ring of integers in $\mathbf{Q}(\sqrt{d})$ where $d \in \{-2, -7, -11, -19\}$.
In each case the time complexity of the algorithm is $O(n^2)$ with small
constants hidden under the big-oh notation. The only operations used in our
algorithms are addition, subtraction and division by a small fixed number (2 or
3 or 5). One of the main results is an extension of the binary gcd algorithm to
a unique factorization domain (ufd) which is not Euclidean (case $d = -19$). Our
extension clearly indicates that the binary gcd like algorithms are not
restricted to Euclidean rings.



1.2 Road Map 

Section 2 contains some preliminaries. In Sect. 3, we review some other
algorithms for computing the gcd. The main idea of our algorithm is presented in
Sect. 4. In Sect. 5, the algorithms for computing the gcd in the ring of
integers in $\mathbf{Q}(\sqrt{d})$ are presented where
$d \in \{-2, -7, -11, -19\}$.



2 Preliminaries 



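The definitions/facts in this section are found in most books on algebra and/or
algebraic number theory (for example see [10,6,7]). Complex quadratic number
fields are of the form $\mathcal{Q} = \mathbf{Q}(\sqrt{d})$ where $d$ is a
negative square-free rational integer. Any $\alpha \in \mathcal{Q}$ is of the
form $a + b\sqrt{d}$ where $a, b \in \mathbf{Q}$. For any
$\alpha = a + b\sqrt{d}$, the norm of $\alpha$ is defined as
$N(\alpha) = \alpha\bar{\alpha}$ where $\bar{\alpha} = a - b\sqrt{d}$ is the
conjugate of $\alpha$. If $\mathcal{Z}$ denotes the ring of algebraic integers
in $\mathcal{Q}$, then

\[ \mathcal{Z} = \begin{cases} \mathbf{Z} + \mathbf{Z}\sqrt{d} & \text{if } d \equiv 2, 3 \pmod 4, \\ \mathbf{Z} + \mathbf{Z}\,\dfrac{-1 + \sqrt{d}}{2} & \text{if } d \equiv 1 \pmod 4. \end{cases} \]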


There are nine complex quadratic rings which are also ufd. These are the ring
of integers in $\mathbf{Q}(\sqrt{d})$ where
$d \in \{-1, -2, -3, -7, -11, -19, -43, -67, -163\}$ [18]. From now on we will
assume that $\mathcal{Z}$ is one of these nine rings.

For all $\alpha \in \mathcal{Z}$, $N(\alpha)$ is a non-negative rational
integer and $N(\alpha) = 0$ iff $\alpha = 0$. An element $u \in \mathcal{Z}$ is
a unit iff $N(u) = 1$. Any two elements $\alpha, \beta \in \mathcal{Z}$ are
called associates if $\alpha | \beta$ and $\beta | \alpha$. A non-zero non-unit
element $\rho \in \mathcal{Z}$ is a prime if
$(\rho | \alpha\beta) \Rightarrow (\rho | \alpha$ or $\rho | \beta)$. If
$\rho \in \mathcal{Z}$ is a prime, then there exists a rational prime $p$ such
that $N(\rho) = p$ or $p^2$. In the former case $\rho$ is not associate to any
rational prime and in the latter case $\rho$ is an associate to $p$. The
quotient ring $\mathcal{Z}/\rho\mathcal{Z}$ is a finite field with $N(\rho)$
elements. If $N(\rho) = p$ for some odd rational prime $p$, then
$\{-\frac{p-1}{2}, \ldots, 0, \ldots, \frac{p-1}{2}\}$ forms a complete set of
coset representatives for $\mathcal{Z}/\rho\mathcal{Z}$ and if $N(\rho) = 2$,
then $\{0, 1\}$ forms a complete set of coset representatives for
$\mathcal{Z}/\rho\mathcal{Z}$.
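
To make the norm concrete, the following Python sketch evaluates it on an element written in the integral basis given above. The representation of an element as a pair of rational integers $(x, y)$ and the function names are assumptions made for illustration; they are not notation from the paper.

```python
def norm(d, x, y):
    """Norm of x + y*w in the ring of integers of Q(sqrt(d)), d < 0 square-free,
    where w = sqrt(d) if d = 2, 3 (mod 4) and w = (-1 + sqrt(d))/2 if d = 1 (mod 4)."""
    if d % 4 == 1:
        # x + y*w = (x - y/2) + (y/2)*sqrt(d), so the norm is x^2 - x*y + ((1-d)/4)*y^2
        return x * x - x * y + ((1 - d) // 4) * y * y
    return x * x - d * y * y


def is_unit(d, x, y):
    """An element is a unit exactly when its norm is 1."""
    return norm(d, x, y) == 1
```

For example, in the ring for $d = -7$ the basis element $w$ has norm 2, so it is a prime lying over 2 with residue field $\{0, 1\}$, and for $d = -19$ the element $w$ has norm 5.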

Binary GCD Like Algorithms for Some Complex Quadratic Rings

Let $\alpha, \beta \in \mathcal{Z}$ and $\alpha\beta \ne 0$. Then a non-zero
element $g \in \mathcal{Z}$ is said to be a greatest common divisor (gcd) of
$\alpha$ and $\beta$ if

a. $g | \alpha$ and $g | \beta$, and

b. for any $\gamma \in \mathcal{Z} \setminus \{0\}$, if $\gamma | \alpha$ and
$\gamma | \beta$, then $\gamma | g$.

For any $\alpha \ne 0$, the gcd of $\alpha$ and 0 is defined to be $\alpha$. In
the literature, the gcd of $\alpha$ and $\beta$ is denoted by $(\alpha, \beta)$
and we also use this notation. In general $(\alpha, \beta)$ is not unique.
However if $g_1 = (\alpha, \beta)$ and $g_2 = (\alpha, \beta)$, then $g_1$ and
$g_2$ are associates. It is customary to overload the '=' operator for gcd. Thus
a statement like $(\alpha, \beta) = (\gamma, \eta)$ means that the gcd of
$\alpha, \beta$ and the gcd of $\gamma, \eta$ are associates. The following
facts about gcd are easily shown.

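Lemma 1. Let $\alpha, \beta \in \mathcal{Z}$ be arbitrary integers and $\rho \in \mathcal{Z}$ be any prime.

a. If $\rho \mid \alpha$ and $\rho \mid \beta$, then $(\alpha, \beta) = \rho\,(\alpha/\rho, \beta/\rho)$.

b. If $\rho \mid \alpha$ and $\rho \nmid \beta$, then $(\alpha, \beta) = (\alpha/\rho, \beta)$.

c. $(\alpha, \beta) = (\alpha + \lambda\beta, \beta)$ for all $\lambda \in \mathcal{Z}$.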

3 Related Work 

The main aim of this section is to present some of the algorithms which can 
possibly be used or extended to compute the gcd in different number rings. There 
are many different known ways of computing the gcd and the list of algorithms 
discussed in this section is not exhaustive. In the rest of this section the term 
’ring’ will always mean number ring. 



3.1 Euclidean Algorithm 

The problem of computing the gcd is as old as number theory itself. Euclid gave 
an algorithm to compute the gcd of rational integers in 300 B.C. [9]. This
algorithm is called the Euclidean Algorithm (EA). However EA cannot be extended
to all rings. The rings in which one can extend EA are the Euclidean rings. A
fairly complete list of all known Euclidean rings can be found in [12]. 

Let $R$ be any Euclidean ring. Then by definition of Euclidean ring, there
exists $\phi : R \to \mathbb{N} \cup \{0\}$ such that for any $a, b \in R$ there exist $q, r \in R$ such
that $a = bq + r$ and $\phi(r) < \phi(b)$. If $r \neq 0$, then $(a, b) = (r, b)$ and if $r = 0$,
then $(a, b) = b$. A basic step in EA takes inputs $a$ and $b$ and finds such $q$ and $r$.
This step is called Euclidean division. EA repeats this step until $r = 0$ and then
outputs the result. As $\phi$ is a non-negative integer-valued function, EA will always
terminate. The analysis of EA in $\mathbb{Z}$ is well studied [2,9]. The running time of EA
in $\mathbb{Z}$ is $O(n^2)$. One can establish that the time complexity of EA in imaginary
Euclidean quadratic rings is $O(\mu(n)\,n)$ (see for example [14,12]) where $\mu(n)$ is
the complexity of multiplying two $n$-bit integers and $\mu(n) = O(n \log n \log\log n)$
by [16]. A similar bound follows for some cyclotomic rings from the works of
Hendrik Lenstra [13], Renate Scheidler and Hugh Williams [15].
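
As an illustration of Euclidean division in an imaginary Euclidean quadratic
ring, the following Python sketch runs EA in the Gaussian integers $\mathbb{Z}[i]$, taking
$\phi$ to be the norm. It is a minimal sketch and not one of the cited algorithms;
the pair representation `(x, y)` for $x + yi$ and the names `gauss_divmod` and
`gauss_gcd` are assumptions made here for illustration.

```python
def gauss_divmod(a, b):
    """Euclidean division in Z[i]: return (q, r) with a = b*q + r, N(r) <= N(b)/2."""
    (a0, a1), (b0, b1) = a, b
    n = b0 * b0 + b1 * b1              # N(b), assumed non-zero
    # a * conj(b) = x + y*i, so the exact quotient a/b is (x + y*i) / n
    x = a0 * b0 + a1 * b1
    y = a1 * b0 - a0 * b1
    # round each coordinate to a nearest integer using integer arithmetic only
    q0 = (2 * x + n) // (2 * n)
    q1 = (2 * y + n) // (2 * n)
    # r = a - b*q
    r0 = a0 - (b0 * q0 - b1 * q1)
    r1 = a1 - (b0 * q1 + b1 * q0)
    return (q0, q1), (r0, r1)


def gauss_gcd(a, b):
    """Repeat Euclidean division until the remainder is zero."""
    while b != (0, 0):
        _, r = gauss_divmod(a, b)
        a, b = b, r
    return a
```

The result is determined only up to multiplication by a unit ($\pm 1$, $\pm i$), in line
with the convention above that '=' on gcds means equality up to associates.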

There are two techniques for speeding up EA for large inputs. The first technique
is by D.H. Lehmer [11]. The basic idea behind this scheme is to perform
single-precision arithmetic with leading digits of the input most of the time and fewer
multiple-precision operations. André Weilert [21] and George Collins [4] have
established that a similar scheme on Gaussian integers has a time complexity of
$O(n^2)$. The second technique is controlled Euclidean descent by Arnold Schönhage
[16]. For the five imaginary quadratic Euclidean rings, André Weilert has
established that this technique has a time complexity of $O(\mu(n) \log n)$ [22,20].

3.2 Non-Euclidean Algorithms

Erich Kaltofen and Heinrich Rolletschek [8] gave an $O(n^2)$ algorithm for
computing the gcd in all complex quadratic ufds. They transform the problem of
computing the gcd to that of computing a short vector in a suitable four-dimensional
integer lattice. They have also given an $O(n^2)$ algorithm to compute the
gcd in any quadratic ring. This algorithm is based on a result which states that,
given $\alpha$ and $\beta$ in some quadratic ring, one can always divide $l\alpha$ by $\beta$ and have
a remainder with norm smaller than $N(\beta)$, where $l$ is a small rational integer.
However the algorithm has large constants under the big-oh notation [8,21].

Henri Cohen has given a general algorithm for computing the extended gcd
by reducing the problem to that of computing Hermite normal form of a suitable 
rational integer matrix [3]. However, he has not given the exact complexity of 
this algorithm in different rings. 

The simplicity of the operations used in the binary gcd algorithm makes it
the method of choice on real computers [2]. This algorithm has been generalized
by André Weilert [21] to the ring of integers in $\mathbb{Q}(\sqrt{-1})$ and by Ivan Damgård
and Gudmund Frandsen [5] to the ring of integers in $\mathbb{Q}(\sqrt{-3})$. In both cases
the running time of the algorithm is $O(n^2)$ with small constants hidden under
the big-oh notation. There have been several variations and enhancements of the
original binary gcd algorithm (see the notes at the end of chapter 4 in [2]). Our
approach is quite similar to that of Jonathan Sorenson's [17] k-ary algorithm.
The main reason we look at the possibility of extending the binary gcd algorithm
is its simplicity and its speed. While approaches like controlled Euclidean
descent are expected to be asymptotically faster, they are impractical for smaller
inputs.

4 Binary GCD Like Algorithms in Number Rings 

The binary gcd algorithm for $\mathbb{Z}$ is shown in Alg. 1. The algorithm is slightly
modified to include negative integers. The algorithm is based on the following
three facts:

a. If $2 \mid a$ and $2 \mid b$, then $(a, b) = 2\left(\frac{a}{2}, \frac{b}{2}\right)$.

b. If $2 \mid a$ and $2 \nmid b$, then $(a, b) = \left(\frac{a}{2}, b\right)$.

c. If $2 \nmid a$ and $2 \nmid b$, then $(a, b) = \left(\frac{a+b}{2}, b\right) = \left(\frac{a-b}{2}, b\right)$.

Thus given two non-zero odd $a, b \in \mathbb{Z}$ with $|a| \ge |b|$, one can find $c \in \mathbb{Z}$ in
$O(\log |a|)$ time such that $|c| < |a|/2$ and $(a, b) = (c, b)$. This fact forms the basis of
the while loop in Alg. 1 and guarantees that the algorithm will terminate in at
most $(\log |a||b| + 1)$ iterations of the while loop. This algorithm may not work
for algebraic integers. This is because if $\alpha$ and $\beta$ are algebraic integers such that
$2 \nmid \alpha$ and $2 \nmid \beta$, then 2 need not divide $\alpha + \beta$ or $\alpha - \beta$. Thus the argument for
termination of the above algorithm fails for algebraic integers.

Algorithm 1 Binary Gcd algorithm for $\mathbb{Z}$ ($a$, $b$ are inputs)

 1. Find $i, j \ge 0$ such that $2^i \mid a$, $2^{i+1} \nmid a$, $2^j \mid b$ and $2^{j+1} \nmid b$.
 2. $a = a/2^i$, $b = b/2^j$
 3. Assert that $|a| \ge |b|$. Swap $a$ and $b$ if needed.
 4. while true
 5.     if $|a - b| < |a + b|$ then $c = a - b$
 6.     else $c = a + b$
 7.     if $c = 0$ then break
 8.     Find $h \ge 1$ such that $2^h \mid c$ and $2^{h+1} \nmid c$.
 9.     $c = c/2^h$
10.     if $|c| \ge |b|$ then $a = c$
11.     else $a = b$, $b = c$
12. return $2^{\min(i,j)}\, b$
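
The steps of Alg. 1 can be sketched in Python as follows. This is an
illustrative transcription under stated assumptions rather than a tuned
implementation: the names `strip_twos` and `binary_gcd` are chosen here, zero
inputs are handled by an extra guard that Alg. 1 does not spell out, and the
result is normalised to be non-negative.

```python
def strip_twos(x):
    """Return (h, y) with x == 2**h * y and y odd; x must be non-zero."""
    h = 0
    while x % 2 == 0:
        x //= 2
        h += 1
    return h, x


def binary_gcd(a, b):
    if a == 0 or b == 0:                 # guard added for this sketch
        return abs(a + b)
    i, a = strip_twos(a)                 # steps 1-2
    j, b = strip_twos(b)
    if abs(a) < abs(b):                  # step 3
        a, b = b, a
    while True:                          # step 4
        # steps 5-6: pick the smaller of a - b and a + b (both are even)
        c = a - b if abs(a - b) < abs(a + b) else a + b
        if c == 0:                       # step 7
            break
        _, c = strip_twos(c)             # steps 8-9
        if abs(c) >= abs(b):             # steps 10-11 keep |a| >= |b|
            a = c
        else:
            a, b = b, c
    return 2 ** min(i, j) * abs(b)       # step 12
```

Only additions, subtractions, comparisons and divisions by 2 occur inside the
loop, which is the simplicity the text refers to.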

Our aim is to create an algorithm similar to Alg. 1 for complex quadratic
rings. As a first step we will generalize the binary gcd algorithm to use rational
primes other than 2. This generalization can be seen as a special case of
Sorenson's k-ary algorithm [17]. Suppose we have $a, b \in \mathbb{Z}$ such that $a$ and $b$ are
co-prime to an odd prime $p$; then $p \mid (a + lb)$ for some $l \in \mathbb{Z}_p$, where $\mathbb{Z}_p = \mathbb{Z}/p\mathbb{Z}$
is the finite field of residues modulo $p$. The set $\{-\frac{p-1}{2}, \ldots, 0, \ldots, \frac{p-1}{2}\}$ forms a
complete set of coset representatives for $\mathbb{Z}_p$ and hence we can always choose $l$
such that $|l| < \frac{p}{2}$. Therefore by using $p$ in place of 2 in Alg. 1, we have a p-ary
algorithm for calculation of gcd in $\mathbb{Z}$ as shown in Alg. 2. In Alg. 2, assuming
$|a| \ge |b|$, it takes $O(\log |a|)$ time to compute a suitable $c$ in steps 5 and 6. Thus
we can replace $a$ with $c$ such that $(a, b) = (c, b)$ and $|c| < \left(\frac{1}{2} + \frac{1}{2p}\right)|a|$, and the
complexity of Alg. 2 is the same as the complexity of the binary gcd algorithm.
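
A minimal Python sketch of the p-ary idea behind Alg. 2 is given below. It
assumes `p` is an odd prime and Python 3.8 or later (for the modular inverse via
`pow(x, -1, p)`); the function names and the zero-input guard are illustrative
choices made here, not part of the paper.

```python
def strip_p(x, p):
    """Return (h, y) with x == p**h * y and p not dividing y; x must be non-zero."""
    h = 0
    while x % p == 0:
        x //= p
        h += 1
    return h, x


def p_ary_gcd(a, b, p=3):
    if a == 0 or b == 0:                     # guard added for this sketch
        return abs(a + b)
    i, a = strip_p(a, p)                     # steps 1-2
    j, b = strip_p(b, p)
    if abs(a) < abs(b):                      # step 3
        a, b = b, a
    half = (p - 1) // 2
    while True:
        # step 5: l = -a / b (mod p), moved into the symmetric range [-half, half]
        l = (-a * pow(b % p, -1, p)) % p
        if l > half:
            l -= p
        c = a + l * b                        # step 6, divisible by p
        if c == 0:                           # step 7
            break
        _, c = strip_p(c, p)                 # steps 8-9
        if abs(c) >= abs(b):                 # steps 10-11
            a = c
        else:
            a, b = b, c
    return p ** min(i, j) * abs(b)           # step 12
```

For example, `p_ary_gcd(48, 18, 3)` strips the factors of 3, reduces the pair
(16, 2) to (2, 2), and returns 6.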

Algorithm 2 p-ary Gcd Algorithm for $\mathbb{Z}$ ($p \in \mathbb{Z}$ is a fixed odd prime and inputs
are $a, b \in \mathbb{Z}$)

 1. Find $i, j \ge 0$ such that $p^i \mid a$, $p^{i+1} \nmid a$, $p^j \mid b$ and $p^{j+1} \nmid b$.
 2. $a = a/p^i$, $b = b/p^j$
 3. Assert that $|a| \ge |b|$. Swap $a$ and $b$ if needed.
 4. while true
 5.     Find $l \in \{-\frac{p-1}{2}, \ldots, 0, \ldots, \frac{p-1}{2}\}$ such that $(a + lb) \equiv 0 \pmod p$
 6.     $c = a + lb$
 7.     if $c = 0$ then break
 8.     Find $h \ge 1$ such that $p^h \mid c$ and $p^{h+1} \nmid c$.
 9.     $c = c/p^h$
10.     if $|c| \ge |b|$ then $a = c$
11.     else $a = b$, $b = c$
12. return $p^{\min(i,j)}\, b$

Let $\mathcal{Z}$ be an imaginary quadratic number ring. If $\rho$ is a prime in $\mathcal{Z}$, then
$\mathcal{F} = \mathcal{Z}/\rho\mathcal{Z}$ is a field. Thus if $\alpha$ and $\beta$ are any two integers co-prime to $\rho$, then
there exists a $\lambda \in \mathcal{F}$ such that $\alpha + \lambda\beta$ is divisible by $\rho$. By Lemma 1 we know that
$(\alpha, \beta) = (\alpha + \lambda\beta, \beta)$. Thus by replacing $\{-\frac{p-1}{2}, \ldots, 0, \ldots, \frac{p-1}{2}\}$ with $\mathcal{F}$ in step
5 of Alg. 2 one can construct an algorithm similar to Alg. 2 for computing the
gcd in $\mathcal{Z}$. If $N(\alpha + \lambda\beta) < f\,N(\rho\alpha)$ for some $f < 1$, then one can show that such an
algorithm will terminate. However for a fixed choice of $\rho$ it is not guaranteed
that there will exist a $\lambda \in \mathcal{F}$ satisfying the above termination condition. For
example consider the ring of integers in $\mathbb{Q}(\sqrt{-2})$. In this ring $\zeta = \sqrt{-2}$ is a
prime of norm 2. If $\alpha$ and $\beta$ are any two integers in this ring and co-prime to
$\zeta$, then $\alpha \pm \beta$ is divisible by $\zeta$. If $\gamma$ is the norm-wise smaller of $\{\alpha + \beta, \alpha - \beta\}$,
then $N(\gamma) \le 2N(\alpha)$ (it follows using Lemma 2 which is mentioned later). Thus
$N(\gamma/\zeta) \le N(\alpha)$ and the above argument for termination breaks down. However
one can get around this problem in some imaginary quadratic rings as follows
(one can get around the termination problem in this particular ring in another
way which is mentioned in the conclusion).

Suppose $p$ is a rational prime. Then $p$ either splits, ramifies, or remains
inert in a quadratic ring $\mathcal{Z}$. Suppose $p$ splits into $\rho$ and $\bar{\rho}$. In this situation $\rho$ and
$\bar{\rho}$ are not associates and are co-prime to each other. Since $N(\rho) = N(\bar{\rho}) = p$,
$\{-\frac{p-1}{2}, \ldots, 0, \ldots, \frac{p-1}{2}\}$ or $\{0, 1\}$ forms a complete set of coset representatives
for both $\mathcal{Z}/\rho\mathcal{Z}$ and $\mathcal{Z}/\bar{\rho}\mathcal{Z}$, depending on whether $p$ is odd or even. Now if $\alpha$ and $\beta$ are
any two integers co-prime to both $\rho$ and $\bar{\rho}$, then we have two choices of $\lambda$ ($\lambda_1$
and $\lambda_2$ such that $\rho \mid (\alpha + \lambda_1\beta)$ and $\bar{\rho} \mid (\alpha + \lambda_2\beta)$). The idea is now to use both
primes and then choose the $\lambda$ which makes $\alpha + \lambda\beta$ small. However this trick is useful
only when there is at most one choice of $\lambda \in \mathcal{F}$ which can result in $\alpha + \lambda\beta$
having a large norm. One can verify that this favorable situation occurs only
when $N(\rho) \le 5$.

5 GCD Algorithms for Complex Quadratic Rings 

In this section we materialize the ideas presented in the last section and construct
algorithms for computing the gcd in the ring of integers in $\mathbb{Q}(\sqrt{d})$ where
$d \in \{-2, -7, -11, -19\}$. These algorithms can be seen as an instance of the
abstract algorithm shown in Alg. 3. The main difference from Alg. 2 is that we
use a pair of conjugate primes instead of one fixed prime. Let $\rho$ and $\bar{\rho}$ be the
primes used in Alg. 3. The equivalent of $c$ from Alg. 2 is $C(\alpha, \beta)$ and is denoted
by $\gamma$. For different rings, $\rho$ and $C$ are different.

Algorithm 3 Gcd algorithm for $\mathcal{Z}$ (inputs are $\alpha$, $\beta$ and primes $\rho$, $\bar{\rho}$ are fixed)

 1. Let $\rho_1 = \rho$, $\rho_2 = \bar{\rho}$
 2. Find $i_1$, $i_2$, $j_1$ and $j_2$ such that $\rho_t^{i_t} \mid \alpha$, $\rho_t^{i_t+1} \nmid \alpha$, $\rho_t^{j_t} \mid \beta$ and $\rho_t^{j_t+1} \nmid \beta$ for $t = 1, 2$.
 3. $\alpha = \alpha / (\rho_1^{i_1} \rho_2^{i_2})$, $\beta = \beta / (\rho_1^{j_1} \rho_2^{j_2})$
 4. Assert that $N(\alpha) \ge N(\beta)$. Swap $\alpha$ and $\beta$ if needed.
 5. while true
 6.     $\gamma = C(\alpha, \beta)$
 7.     if $\gamma = 0$ then break
 8.     Find $(h_1, h_2)$ such that $\rho_t^{h_t} \mid \gamma$ and $\rho_t^{h_t+1} \nmid \gamma$ for $t = 1, 2$.
 9.     $\eta = \gamma / (\rho_1^{h_1} \rho_2^{h_2})$
10.     if $N(\eta) \ge N(\beta)$ then $\alpha = \eta$
11.     else $\alpha = \beta$, $\beta = \eta$
12. return $\rho_1^{\min(i_1, j_1)} \rho_2^{\min(i_2, j_2)}\, \beta$

In the rest of this section we will show how to choose $\rho$ and $C$ in different
rings and prove the termination of the algorithm in each case. Note that if Alg. 3
terminates, it will terminate with the correct answer if $(\alpha, \beta) = (C(\alpha, \beta), \beta)$. In all
the rings, our choice of $C(\alpha, \beta)$ is $\alpha + l\beta$ for some rational integer $l$. Thus the
correctness follows from Lemma 1. The following lemma will be the major tool
in proving the termination results.

Lemma 2. Let $\mathcal{Z}$ be an imaginary quadratic ring. Let $\alpha, \beta \in \mathcal{Z}$ and $l, m \in \mathbb{Z}$.
If $N(\alpha) \ge N(\beta)$, then

a. $N(l\alpha + m\beta) \le (|l| + |m|)^2 N(\alpha)$,

b. $\min\{N(l\alpha + m\beta), N(l\alpha - m\beta)\} \le (l^2 + m^2) N(\alpha)$.

Proof. We note that for any integer $\alpha$, $N(\alpha)$ is the same as the complex norm of
$\alpha$ when $\alpha$ is viewed as a complex number. The proof of the above statements now follows
from elementary properties of the complex norm. □
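
The "elementary properties" invoked in the proof can be spelled out. The
following is one possible derivation, offered as a sketch rather than as the
authors' own argument; it uses only $N(\alpha) = |\alpha|^2$, $|\beta| \le |\alpha|$, the triangle
inequality for part (a) and the parallelogram law for part (b).

```latex
\begin{align*}
% (a) triangle inequality, then |\beta| \le |\alpha|
N(l\alpha + m\beta) &= |l\alpha + m\beta|^2
   \le \bigl(|l|\,|\alpha| + |m|\,|\beta|\bigr)^2
   \le (|l| + |m|)^2\,|\alpha|^2 = (|l| + |m|)^2 N(\alpha), \\
% (b) parallelogram law, then N(\beta) \le N(\alpha)
N(l\alpha + m\beta) + N(l\alpha - m\beta)
   &= 2\bigl(l^2 N(\alpha) + m^2 N(\beta)\bigr)
   \le 2\,(l^2 + m^2)\,N(\alpha), \\
% the smaller of two numbers is at most their mean
\min\{N(l\alpha + m\beta),\, N(l\alpha - m\beta)\}
   &\le (l^2 + m^2)\,N(\alpha).
\end{align*}
```

With $l = m = 1$ this gives the bounds $N(\alpha \pm \beta) \le 4N(\alpha)$ and
$\min\{N(\alpha + \beta), N(\alpha - \beta)\} \le 2N(\alpha)$ that are used repeatedly below.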

For $d \in \{-2, -7, -11\}$, we show that $N(\eta) < f\,N(\alpha)$ for some $f < 1$ and the
termination of the algorithm is trivial. For $d = -19$, we show that in at most
two iterations of the while loop, the product $N(\alpha)N(\beta)$ will decrease by a factor
$f > 1$ and hence the algorithm will terminate. The time complexity of Alg. 3 is
the same in all rings and we discuss this in Sect. 5.5. In the rest of this section
we will compare algebraic integers with respect to the norm. Thus whenever we
say that $\alpha < \beta$, it means that $N(\alpha) < N(\beta)$. Similarly, other concepts based on
comparison like min and max are defined.



5.1 GCD Algorithm for $\mathcal{Z} = \mathbb{Z}[\frac{1}{2}(-1 + \sqrt{-19})]$

The algebraic integers in this ring are of the form $a + b\omega$ where $\omega = \frac{1}{2}(-1 + \sqrt{-19})$
and $a, b \in \mathbb{Z}$. Here $\omega$ and $\bar{\omega}$ are non-associate primes of norm 5. If $\alpha, \beta \in \mathcal{Z}$
are co-prime to both $\omega$ and $\bar{\omega}$, then $\omega \mid (\alpha + l\beta)$ and $\bar{\omega} \mid (\alpha + m\beta)$ for some $l, m \in
\{-2, -1, 1, 2\}$. Based on this observation, we construct a gcd algorithm in this
ring by taking $\{\rho, \bar{\rho}\} = \{\omega, \bar{\omega}\}$ and defining $C(\alpha, \beta)$ as follows. If $\alpha + \beta$ is divisible
by $\omega$ or $\bar{\omega}$, then return $\alpha + \beta$. If $\alpha - \beta$ is divisible by $\omega$ or $\bar{\omega}$, then return $\alpha - \beta$.
Otherwise, let $A_{\min} = \min\{\alpha + 2\beta, \alpha - 2\beta\}$ and $A_{\max} = \max\{\alpha + 2\beta, \alpha - 2\beta\}$. If $A_{\min}$
is divisible by $\omega$ or $\bar{\omega}$, then return $A_{\min}$. Else return $A_{\max}$ with the assurance that it
is divisible by $\omega\bar{\omega}$. Showing that such an algorithm terminates is slightly
non-trivial. In the next two lemmas we show that either after one iteration or after
two iterations of the while loop, the product of the norms of the two integers
decreases by a factor of at least 1.16. Thus the algorithm will terminate in at
most $2(\log_{1.16}(N(\alpha)N(\beta)) + 1)$ iterations of the while loop.



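Lemma 3. Let $\alpha, \beta \in \mathcal{Z}$ be the algebraic integers at the start of the while loop
in Alg. 3 with $\{\rho, \bar{\rho}\} = \{\omega, \bar{\omega}\}$ and $C(\alpha, \beta)$ as described above. If $C(\alpha, \beta) \neq
\min\{\alpha + 2\beta, \alpha - 2\beta\}$, then after one iteration of the while loop, the quantity
$N(\alpha)N(\beta)$ decreases by a factor of at least $\frac{5}{4}$.

Proof. Without loss of generality assume that $N(\alpha) \ge N(\beta)$ and $\min\{\alpha + 2\beta, \alpha -
2\beta\} = \alpha - 2\beta$. If $\gamma = C(\alpha, \beta) \neq \min\{\alpha + 2\beta, \alpha - 2\beta\}$, then either $\gamma = \alpha \pm \beta$ or
$\gamma = \alpha + 2\beta$.

Suppose $\gamma = \alpha \pm \beta$. Then from Lemma 2, $N(\gamma) \le (1 + 1)^2 N(\alpha) = 4N(\alpha)$. By
the description of $\gamma = C(\alpha, \beta)$, note that $\gamma$ is divisible by $\omega$ or $\bar{\omega}$. Thus $\eta = \gamma / (\omega^i \bar{\omega}^j)$
where $i$ and $j$ are non-negative rational integers with $i + j \ge 1$. As $i + j \ge 1$,
$N(\omega^i \bar{\omega}^j) \ge 5$ and hence,

$$N(\eta) \le \frac{N(\gamma)}{N(\omega^i \bar{\omega}^j)} \le \frac{4}{5} N(\alpha).$$

Since $\eta$ will replace $\alpha$, $N(\alpha)N(\beta)$ will decrease by a factor of at least $\frac{5}{4}$ in this
case.

Now suppose $\gamma = \alpha + 2\beta$. Again using Lemma 2, $N(\gamma) \le (1 + 2)^2 N(\alpha) =
9N(\alpha)$. From the description of $C(\alpha, \beta)$, note that $\omega\bar{\omega} \mid \gamma$. Thus $\eta = \gamma / (\omega^i \bar{\omega}^j)$ for some
rational integers $i, j \ge 1$. Thus,

$$N(\eta) \le \frac{N(\gamma)}{N(\omega^i \bar{\omega}^j)} \le \frac{9}{25} N(\alpha).$$

Hence the product of norms will decrease by a factor of at least $\frac{25}{9} > \frac{5}{4}$ in this
case. □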



Lemma 4. Let $\alpha, \beta \in \mathcal{Z}$ be the algebraic integers at the start of the while loop in
Alg. 3 with $\{\rho, \bar{\rho}\} = \{\omega, \bar{\omega}\}$ and $C(\alpha, \beta)$ as described above. If $C(\alpha, \beta) = \min\{\alpha +
2\beta, \alpha - 2\beta\}$, then $N(\alpha)N(\beta)$ never increases and either after one iteration or
after two iterations of the while loop, the quantity $N(\alpha)N(\beta)$ decreases by a
factor of at least 1.16.

Proof. From the description of $C(\alpha, \beta)$ we know that $\gamma = C(\alpha, \beta)$ is divisible by
$\omega$ or $\bar{\omega}$. If $\gamma$ is divisible by both $\omega$ and $\bar{\omega}$, or $\gamma$ is divisible by $\zeta^i$ where $\zeta = \omega$
or $\zeta = \bar{\omega}$ and $i \ge 2$, then on the lines of the proof of Lemma 3, it follows that the
product of norms will decrease by a factor of at least 5. Thus we need to consider
only the case where $\gamma$ is divisible by either $\omega$ or $\bar{\omega}$ and exactly once. Let $\gamma$ be
divisible by $\zeta$ where $\zeta = \omega$ or $\bar{\omega}$.

A complex quadratic integer can also be viewed as a point on the complex
plane. On the complex plane, $N(\alpha)$ is the same as the square of the distance of $\alpha$ from
the origin, i.e., $N(\alpha) = |\alpha|^2$. It is an elementary result in geometry that

$$|\alpha + 2\beta|^2 = |\alpha|^2 + |2\beta|^2 + 2|\alpha||2\beta| \cos\theta \quad\text{and}\quad |\alpha - 2\beta|^2 = |\alpha|^2 + |2\beta|^2 - 2|\alpha||2\beta| \cos\theta$$

where $\theta$ is the angle between $\alpha$ and $\beta$. Without loss of generality we assume
that $0 \le \theta \le \frac{\pi}{2}$. Then $N(\alpha - 2\beta) \le N(\alpha + 2\beta)$ and the algorithm will choose
$\gamma = \alpha - 2\beta$. Also assume that $|\alpha| \ge |\beta|$. Thus $|\alpha| = k|\beta|$ where $k \in \mathbb{R}$ and $k \ge 1$.
Thus $N(\alpha) = k^2 N(\beta) \ge N(\beta)$ and hence

$$N(\gamma) = |\alpha - 2\beta|^2 = (k^2 + 4 - 4k\cos\theta)\,|\beta|^2 .$$

As $\eta = \gamma/\zeta$,

$$N(\eta) = \frac{N(\gamma)}{N(\zeta)} = \frac{k^2 + 4 - 4k\cos\theta}{5}\, N(\beta) .$$

Since the algorithm will replace $\alpha$ by $\eta$, the product of norms will change by a
factor $f$, where

$$f = \frac{N(\eta)}{N(\alpha)} = \frac{N(\eta)}{k^2 N(\beta)} = \frac{1}{5} + \frac{4}{5k^2} - \frac{4\cos\theta}{5k} . \qquad (5.1)$$

As $k \ge 1$ and $0 \le \theta \le \frac{\pi}{2}$, $f \le 1$ and hence the algorithm will not increase the
product of norms by replacing $\alpha$ by $\eta$. However, as $f$ can get arbitrarily close to
1, we are not assured that there will be a substantial decrease in the product of
norms. Intuitively, we can see that if $\theta$ is away from $\frac{\pi}{2}$ then $f$ will be
small. The idea of the proof is to show that when $\theta \approx \frac{\pi}{2}$, the angle between
$\eta$ and $\beta$ is away from $\frac{\pi}{2}$. Thus while there may not be a substantial decrease in
the product of norms in this iteration, there will be enough decrease in the next
iteration. Concretely, we now show that the product of norms will decrease by a
factor of 1.16 as long as $k \ge 1.1$ or $\theta \le 0.444\pi$. Then we show that if $1 \le k < 1.1$
and $0.444\pi < \theta \le \frac{\pi}{2}$, then the acute angle between $\eta$ and $\beta$ (if the angle between
$\eta$ and $\beta$ is obtuse then the angle between $\eta$ and $-\beta$ is acute) is less than $0.444\pi$. Thus
in the next iteration either by Lemma 3 or by the arguments in the previous
lines, the product of norms will decrease by a factor of at least 1.16.

Case 1: Suppose that $k \ge 1.1$. Then from (5.1)

$$f \le \frac{1}{5} + \frac{4}{5 \cdot 1.21} < 0.862 .$$

On the other hand, if $\theta \le 0.444\pi$, then $\cos\theta \ge 0.175$ and hence from (5.1),

$$f \le \frac{1}{5} + \frac{4}{5k^2} - \frac{4(0.175)}{5k} .$$

The expression on the right takes its maximum value at $k = 1$ (in the range $1 \le k < \infty$,
as $k < 1$ is not possible). Evaluating, $f \le 0.86$. Thus whenever $k \ge 1.1$ or
$0 \le \theta \le 0.444\pi$, the product of norms will decrease by a factor of at least
1.16.
Case 2: Now let $0.444\pi < \theta \le \frac{\pi}{2}$ and $1 \le k < 1.1$. Since $k < 1.1$, $|\alpha| = k|\beta| <
|2\beta|$. The situation is as shown in Fig. 1. In the figure, the dashed
cone indicates the region in which all integers making an (acute) angle greater
than $0.444\pi$ with $\pm\beta$ can lie.

On the complex plane, $\omega = \frac{1}{2}(-1 + \sqrt{-19}) = \sqrt{5}\,e^{i\omega_\theta}$ and $\bar{\omega} = \frac{1}{2}(-1 - \sqrt{-19}) = \sqrt{5}\,e^{-i\omega_\theta}$
where $\omega_\theta = \pi - \tan^{-1}\sqrt{19}$. Thus after dividing $\gamma$ by $\omega$ or $\bar{\omega}$ we get $\eta$,
which is $\gamma$ reduced in length by $\sqrt{5}$ and rotated by angle $\pm\omega_\theta$. Now we wish to
show that $\eta$ is outside the dashed cone.




Fig. 1. A typical situation when $1 \le k < 1.1$ and $0.444\pi < \theta \le \frac{\pi}{2}$. The dashed conical
region contains points which can make an acute angle greater than $0.444\pi$ with $\pm\beta$.
$\alpha$ lies in this cone and inside the dotted region which is obtained from the requirement
$|\beta| \le |\alpha| < 1.1|\beta|$. Once $\gamma = C(\alpha, \beta) = \alpha - 2\beta$ is divided by $\omega$ or $\bar{\omega}$, the resulting point
gets reduced in length by $\sqrt{5}$ and rotated by $\pm\omega_\theta$ where $\omega_\theta = \pi - \tan^{-1}\sqrt{19}$. The
lengths of the line segments $\gamma D$ and $(-2\beta)D$ are $|\alpha| \sin\theta$ and $|\alpha| \cos\theta$ respectively, which
are obtained by completing the parallelogram $O\alpha\gamma(-2\beta)$.



To show that $\eta$ lies outside the dashed cone, we have to find the
extremal values of $\theta_d$ subject to $1 \le k < 1.1$ and $0.444\pi < \theta \le \frac{\pi}{2}$. From the
figure,

$$\tan\theta_d = \frac{|\alpha| \sin\theta}{2|\beta| - |\alpha| \cos\theta} = \frac{k \sin\theta}{2 - k\cos\theta} .$$

Since $k\cos\theta < 2$, $\theta_d$ is always acute. Analytically or graphically one can verify
that in the region under consideration $\theta_d$ attains its maximum value at $k = 1.1$
and $\theta = 0.444\pi$ and attains its minimum value at $k = 1$ and $\theta = \frac{\pi}{2}$. Evaluating, we
get $\theta_d^{\min} = \tan^{-1}\left(\frac{1}{2}\right) > 0.147\pi$ and $\theta_d^{\max} = \tan^{-1}\left(\frac{1.1 \sin 0.444\pi}{2 - 1.1 \cos 0.444\pi}\right) < 0.172\pi$.
Thus we have

$$0.147\pi < \theta_d < 0.172\pi .$$

Let $\phi$ be the angle between $\eta$ and $\beta$. If $\eta$ lies inside the dashed cone, then we
should have $0.444\pi < \phi < 0.556\pi$ or $-0.556\pi < \phi < -0.444\pi$. As mentioned
already, after division by $\omega$ or $\bar{\omega}$, $\gamma$ will be rotated by $\pm\omega_\theta = \pm(\pi - \tan^{-1}\sqrt{19})$.
By direct computation, $0.571\pi < \omega_\theta < 0.572\pi$. Now suppose that $\gamma$ is rotated
by $\omega_\theta$. Then using the above extremal values of $\theta_d$ and $\omega_\theta$ one sees that $0.718\pi <
\phi < 0.744\pi$. Thus in this case $\eta$ cannot lie in the cone. Similarly if $\gamma$ is rotated
by $-\omega_\theta$, then $-0.425\pi < \phi < -0.399\pi$. Thus again $\eta$ cannot lie in the cone.

Thus in two iterations the product of norms will decrease by a factor of
1.16. We wish to note here that by a similar reasoning one can verify that if
$\delta = (\alpha + 2\beta)/\zeta$ where $\zeta = \omega$ or $\zeta = \bar{\omega}$, then $\delta$ also lies outside the forbidden
cone. While this is not crucial to the proof, it becomes crucial later when we use
approximate norms. □
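
The numerical constants in the proof of Lemma 4 can be re-checked with a few
lines of Python. The sketch below simply evaluates formula (5.1) and the
expression for $\tan\theta_d$ at the extremal points named in the proof, plus a coarse
grid over the Case 1 region; the function names and the grid resolution are
assumptions made here, and floating-point evaluation is a sanity check, not a
proof.

```python
import math


def f(k, theta):
    """Factor (5.1) by which the product of norms changes."""
    return 1 / 5 + 4 / (5 * k * k) - 4 * math.cos(theta) / (5 * k)


def theta_d(k, theta):
    """Angle theta_d from tan(theta_d) = k sin(theta) / (2 - k cos(theta))."""
    return math.atan(k * math.sin(theta) / (2 - k * math.cos(theta)))


T0 = 0.444 * math.pi

# Case 1: worst cases named in the proof
case1_k = f(1.1, math.pi / 2)        # k >= 1.1, cos(theta) = 0
case1_theta = f(1.0, T0)             # theta <= 0.444*pi, k = 1

# Coarse grid over the whole Case 1 region (k >= 1.1 or theta <= 0.444*pi)
ks = [1 + 0.01 * i for i in range(401)]
ts = [(math.pi / 2) * j / 200 for j in range(201)]
worst = max(f(k, t) for k in ks for t in ts if k >= 1.1 or t <= T0)

# Case 2: extremal values of theta_d and the rotation angle, in units of pi
td_min = theta_d(1.0, math.pi / 2) / math.pi
td_max = theta_d(1.1, T0) / math.pi
omega_theta = (math.pi - math.atan(math.sqrt(19))) / math.pi
```

Under these assumptions the computed values agree with the bounds stated in the
proof: both Case 1 values stay below 0.862, and `td_min`, `td_max` and
`omega_theta` fall inside the intervals $(0.147, 0.172)$ and $(0.571, 0.572)$.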

5.2 GCD Algorithm for $\mathcal{Z} = \mathbb{Z}[\frac{1}{2}(-1 + \sqrt{-11})]$

The integers in this ring are of the form $\alpha = a + b\omega$ where $\omega = \frac{1}{2}(-1 + \sqrt{-11})$ and
$a, b \in \mathbb{Z}$. Here $\omega$ and $\bar{\omega}$ are non-associate primes of norm 3. Since there are no
integers of norm 2 in this ring, $\omega$ and $\bar{\omega}$ are the smallest non-rational primes. As
$\{-1, 0, 1\}$ forms a complete set of coset representatives for $\mathcal{Z}/\omega\mathcal{Z}$ and $\mathcal{Z}/\bar{\omega}\mathcal{Z}$, if
$\alpha, \beta \in \mathcal{Z}$ are co-prime to both $\omega$ and $\bar{\omega}$, then $\omega$ divides one of $\{\alpha + \beta, \alpha - \beta\}$ and
$\bar{\omega}$ divides one of $\{\alpha + \beta, \alpha - \beta\}$. Based on this observation, we construct a gcd
algorithm in this ring by taking $\{\rho, \bar{\rho}\} = \{\omega, \bar{\omega}\}$ and defining $C(\alpha, \beta)$ as follows.
Let $A_{\min} = \min\{\alpha + \beta, \alpha - \beta\}$ and $A_{\max} = \max\{\alpha + \beta, \alpha - \beta\}$. If $A_{\min}$ is divisible by either
$\omega$ or $\bar{\omega}$, we pick $A_{\min}$. Else we pick $A_{\max}$ and we now know that $A_{\max}$ is divisible by both
$\omega$ and $\bar{\omega}$. The following lemma shows that the product of norms decreases by a
factor of at least $\frac{3}{2}$ in each iteration of the while loop and hence the algorithm
will terminate in at most $(\log_{3/2} N(\alpha\beta) + 1)$ iterations of the while loop.

Lemma 5. Let $\alpha, \beta \in \mathcal{Z}$ be the algebraic integers at the start of the while loop
in Alg. 3 with $\{\rho, \bar{\rho}\} = \{\omega, \bar{\omega}\}$ and taking $C(\alpha, \beta)$ as described above. Then in
each iteration of the while loop, the quantity $N(\alpha)N(\beta)$ decreases by a factor of
at least $\frac{3}{2}$.

The proof is similar to the proof of Lemma 3.

5.3 GCD Algorithm for $\mathcal{Z} = \mathbb{Z}[\frac{1}{2}(-1 + \sqrt{-7})]$

The integers in this ring are of the form $\alpha = a + b\omega$ where $\omega = \frac{1}{2}(-1 + \sqrt{-7})$ and $a, b \in \mathbb{Z}$.
Here $\omega$ and $\bar{\omega}$ are non-associate primes of norm 2. If $\alpha, \beta \in \mathcal{Z}$ are co-prime to
both $\omega$ and $\bar{\omega}$, then $\omega\bar{\omega} \mid (\alpha + \beta)$ and $\omega\bar{\omega} \mid (\alpha - \beta)$. Based on this observation, we
choose $\{\rho, \bar{\rho}\} = \{\omega, \bar{\omega}\}$ and $C(\alpha, \beta) = \min\{\alpha + \beta, \alpha - \beta\}$ for Alg. 3 for this ring.
The next lemma shows that the product of norms decreases by a factor of at least 2 in each
iteration of the while loop and hence the algorithm will terminate in at most
$(\log N(\alpha\beta) + 1)$ iterations of the while loop.

Lemma 6. Let $\alpha, \beta \in \mathcal{Z}$ be the algebraic integers at the start of the while loop
in Alg. 3 with $\{\rho, \bar{\rho}\} = \{\omega, \bar{\omega}\}$ and $C(\alpha, \beta) = \min\{\alpha + \beta, \alpha - \beta\}$. Then in each
iteration of the while loop, the quantity $N(\alpha)N(\beta)$ decreases by a factor of at
least 2.

The proof is similar to the proof of Lemma 3.
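
To make the abstract Alg. 3 concrete, here is a hedged Python sketch of the
instance for $d = -7$. It is an illustration under stated assumptions, not the
authors' implementation: an integer $a + b\omega$ is stored as the pair `(a, b)`, the
identities $\omega + \bar{\omega} = -1$ and $\omega\bar{\omega} = 2$ give $N(a + b\omega) = a^2 - ab + 2b^2$, and one
can check that $\omega \mid a + b\omega$ iff $a$ is even while $\bar{\omega} \mid a + b\omega$ iff $a - b$ is even. All
function names, and the guard for zero inputs, are choices made here.

```python
P = 2  # N(omega) for omega = (-1 + sqrt(-7)) / 2


def norm(x):
    a, b = x
    return a * a - a * b + P * b * b


def mul(x, y):
    """(a + b w)(c + d w), using w^2 = -w - P."""
    (a, b), (c, d) = x, y
    return (a * c - P * b * d, a * d + b * c - b * d)


def strip(x):
    """Return (i1, i2, y) with x = w^i1 * conj(w)^i2 * y and y coprime to both."""
    a, b = x                        # x must be non-zero
    i1 = i2 = 0
    while a % P == 0:               # w divides a + b w  iff  P | a
        a, b = b - a // P, -(a // P)
        i1 += 1
    while (a - b) % P == 0:         # conj(w) divides a + b w  iff  P | (a - b)
        a, b = -b, (a - b) // P
        i2 += 1
    return i1, i2, (a, b)


def gcd_d7(alpha, beta):
    if alpha == (0, 0) or beta == (0, 0):       # guard added for this sketch
        return beta if alpha == (0, 0) else alpha
    i1, i2, alpha = strip(alpha)                # steps 2-3
    j1, j2, beta = strip(beta)
    if norm(alpha) < norm(beta):                # step 4
        alpha, beta = beta, alpha
    while True:
        s = (alpha[0] + beta[0], alpha[1] + beta[1])
        d = (alpha[0] - beta[0], alpha[1] - beta[1])
        gamma = s if norm(s) < norm(d) else d   # step 6: C = min{a + b, a - b}
        if gamma == (0, 0):                     # step 7
            break
        _, _, eta = strip(gamma)                # steps 8-9
        if norm(eta) >= norm(beta):             # steps 10-11
            alpha = eta
        else:
            alpha, beta = beta, eta
    g = beta                                    # step 12: restore common factors
    for _ in range(min(i1, j1)):
        g = mul(g, (0, 1))                      # times w
    for _ in range(min(i2, j2)):
        g = mul(g, (-1, -1))                    # times conj(w) = -1 - w
    return g
```

As a usage example, $3(1 + 2\omega) = 3 + 6\omega$ and $(1 + \omega)(1 + 2\omega) = -3 + \omega$ share
the factor $1 + 2\omega$ of norm 7, and `gcd_d7((3, 6), (-3, 1))` returns an associate of
it. Since the only units in this ring are $\pm 1$, the result is determined up to sign.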



5.4 GCD Algorithm for $\mathcal{Z} = \mathbb{Z}[\sqrt{-2}]$

The integers in this ring are of the form $\alpha = a + b\sqrt{-2}$ where $a, b \in \mathbb{Z}$. In this ring
$\rho = 1 + \sqrt{-2}$ and $\bar{\rho} = 1 - \sqrt{-2}$ are non-associate primes of norm 3. The gcd
algorithm in this ring is Alg. 3 with choices of $\rho$ and $\bar{\rho}$ as above, and $C(\alpha, \beta)$ is
the same as in the case of $d = -11$. The fact that the algorithm terminates also
follows from a lemma similar to Lemma 5.



5.5 Runtime 

Let $s(x)$ denote the number of bits required to represent $x$. Let $\alpha = a + b\omega$ (where
$\omega = \frac{1}{2}(-1 + \sqrt{d})$ for $d \in \{-7, -11, -19\}$ and $\omega = \sqrt{d}$ for $d = -2$). For any given
ring $\omega$ is constant. Hence we represent the input as a pair $(a, b)$ where $a, b \in \mathbb{Z}$. Let
$\alpha = a + b\omega$ and $\beta = c + d\omega$ be inputs to Alg. 3. If $2^{n-1} \le \max\{|a|, |b|, |c|, |d|\} < 2^n$,
then $s(\alpha) \le 2n + O(1)$ and $s(\beta) \le 2n + O(1)$. The main focus of this section
is to show that the time complexity of Alg. 3 is $O(n^2)$ in all the rings, with
small constants hidden under the big-oh notation. The following lemma is easily
shown.

Lemma 7. If $\alpha \in \mathcal{Z}$, then $s(\alpha) \le s(N(\alpha)) \le 2s(\alpha) + O(1)$.

First consider steps 2-3 of Alg. 3. Divisions are by a fixed prime and are
easily performed. For $d \in \{-7, -11, -19\}$, $\frac{\alpha}{\omega} = b - \frac{a}{p} - \frac{a}{p}\omega$ and $\frac{\alpha}{\bar{\omega}} = -b + \frac{a - b}{p}\omega$,
where $p = N(\omega) = 2$ or 3 or 5 in the respective ring. For $d = -2$, multiplying by the
conjugate prime gives $\frac{\alpha}{\rho} = \frac{a + 2b}{3} + \frac{b - a}{3}\sqrt{-2}$
and $\frac{\alpha}{\bar{\rho}} = \frac{a - 2b}{3} + \frac{a + b}{3}\sqrt{-2}$. Thus each single division step
takes at most $O(n)$ time. Since every single division decreases the product of
norms of $\alpha$ and $\beta$ by a factor of $p$ (where $p = 2$ or 3 or 5), if a total of $r$ divisions
are performed, then using Lemma 7 it follows that $r = O(n)$. Thus steps 2-3
take at most $O(n^2)$ time. As the while loop contains norm computations, the
complexity of step 4 can be ignored. The following lemma will be useful before
we consider the while loop in Alg. 3.
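
The division formulas for $d \in \{-7, -11, -19\}$ translate directly into integer
operations on the pair $(a, b)$ representing $a + b\omega$. The following Python sketch
is illustrative only; the function names are chosen here, and the divisibility
tests ($p \mid a$ for $\omega$, $p \mid (a - b)$ for $\bar{\omega}$) are read off from the requirement that both
coefficients of the quotient be integers.

```python
def div_omega(a, b, p):
    """(a + b w) / w as a pair, or None if w does not divide a + b w."""
    if a % p != 0:
        return None
    return (b - a // p, -(a // p))


def div_omega_bar(a, b, p):
    """(a + b w) / conj(w) as a pair, or None if conj(w) does not divide it."""
    if (a - b) % p != 0:
        return None
    return (-b, (a - b) // p)


def mul(x, y, p):
    """(a + b w)(c + d w), using w^2 = -w - p where p = N(w)."""
    (a, b), (c, d) = x, y
    return (a * c - p * b * d, a * d + b * c - b * d)
```

Each call performs one divisibility test and one division of an $n$-bit integer
by the fixed small prime $p$, which matches the $O(n)$ cost per division step stated
above. Multiplying by $\omega$ = `(0, 1)` or $\bar{\omega}$ = `(-1, -1)` with `mul` and dividing again
returns the original pair.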
Lemma 8. Let $\alpha, \beta \in \mathcal{Z}$ be the algebraic integers at the start of the while loop
in Alg. 3. Then the algorithm halts in at most $k \cdot (s(\alpha) + s(\beta)) + O(1)$ iterations
of the while loop for some positive real number $k$.

Proof. Let $\mathcal{Z} = \mathbb{Z}[\frac{1}{2}(-1 + \sqrt{-7})]$. Then from Lemma 6 we know that in each
iteration of the while loop the quantity $N(\alpha\beta)$ decreases by a factor of 2. Thus
the algorithm will halt in at most $(\log N(\alpha\beta) + 1)$ iterations of the while loop.
But,

$$\log N(\alpha\beta) + 1 \le s(N(\alpha)) + s(N(\beta)) + 1 \le 2(s(\alpha) + s(\beta)) + O(1) .$$

Similarly we can show this in every ring. The exact values of the constant $k$ are
shown in the fourth column of Table 1. □
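
The constants $k$ reported in Table 1 appear to follow from the same argument
applied with the decrease factor of each ring. The Python sketch below rests on
an assumption made here, not a formula stated in the paper: $k = 2/\log_2 f$ when
the product of norms shrinks by $f$ in every iteration, and $k = 4/\log_2 f$ when the
factor $f$ is only guaranteed over two iterations (the case $d = -19$).

```python
import math


def k_single(f):
    """Iterations per input bit when each iteration shrinks N(a)N(b) by f."""
    return 2 / math.log2(f)


def k_double(f):
    """Same, when the factor f is only guaranteed over two iterations."""
    return 4 / math.log2(f)


k_d7 = k_single(2.0)      # d = -7
k_d11 = k_single(1.5)     # d = -11 and d = -2
k_d19 = k_double(1.16)    # d = -19
```

Under this assumption the values come out close to 2, 3.42 and 18.7, which is
consistent with the fourth column of Table 1.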

From Lemma 8 it follows that the number of iterations of the while loop in
Alg. 3 is $O(n)$ for every ring. In each iteration of the while loop the operations
performed are addition, subtraction, division by 2 or 3 or 5 and computation of
the norm. Addition and subtraction can be done in $O(n)$ time. A single divisibility
test and a single division step can also be done in $O(n)$ time. There are now two
issues: first, the number of divisions performed in each iteration, and second,
the computation of the norm. Let us assume that there are $t_i$ divisions in the $i$-th
iteration and the norm can be computed in $T(n)$ time. Since there are $O(n)$
iterations of the while loop, the combined time complexity of all the iterations
of the while loop is $O(n(n + T(n) + \sum_i t_i))$.

We will now estimate $\sum_i t_i$. Each division reduces the product of norms
by a factor of $p$, where $p = 2$ or 3 or 5, while forming $\gamma = C(\alpha, \beta)$ increases it
by at most a constant factor (Lemma 2). Since there are only $O(n)$ iterations,
$p^{\sum_i t_i} \le p^{O(n)} N(\alpha\beta)$ and hence $\sum_i t_i = O(n)$. Thus the combined time complexity of all
the iterations of the while loop is $O(n^2 + nT(n))$. Since the startup to the while loop
takes $O(n^2)$ time, the time complexity of Alg. 3 is $O(n^2 + nT(n))$.

Let us now concentrate on $T(n)$. Norm computation involves a fixed number
of multiplications and additions. Hence $T(n)$ is of the order of the multiplication of
two $n$-bit numbers. However it is not necessary to compute the norm exactly. The
algorithm uses the norm to compare integers. When norms are very close, it does
not make much difference which one the algorithm picks. If the algorithm uses
approximate norms to compare integers, then it may have to do more iterations
to get the gcd, but the number of iterations still remains $O(n)$. For details we refer
to the earlier known binary gcd like algorithms for the rings of integers in $\mathbb{Q}(\sqrt{-1})$
[21] and $\mathbb{Q}(\sqrt{-3})$ [5]. The approximate norm can be computed in $O(n)$ time [5]
and hence the complexity of Alg. 3 becomes $O(n^2)$.

The fact that the algorithm terminates when the approximate norm is used is quite
easy to show for the cases $d \in \{-2, -7, -11\}$. For $d = -19$ one needs the fact
mentioned at the end of the proof of Lemma 4. The effect of approximation on the
number of iterations of the while loop in Alg. 3 is shown in Table 1. In Table 1,
the second column is the factor by which the product of norms would decrease
if we computed the norm accurately. The third column lists the factor by which the
product of norms would decrease if we compute an approximate norm along the
lines of Lemma 1 in [5] by using only the 16 most significant bits in multiplications.
The number of iterations performed is quantified by the constant $k$ of Lemma 8.
The fourth column lists the value of this constant when the norm is computed
accurately. The revised values of this constant with the approximate norm are listed
in the last column. As a conclusion to this section we have the following theorem.

Theorem 1. The running time of Alg. 3 is $O(n^2)$ where $n$ is the number of bits
in the representation of the input and where the norm is computed approximately.



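Table 1. Decrease factors and iteration constants for Alg. 3 ($f$, $k$: exact norm; $f'$, $k'$: approximate norm).

Ring                                              f      f'     k      k'
$\mathbb{Z}[\sqrt{-2}]$                           1.5    1.49   3.42   3.48
$\mathbb{Z}[\frac{1}{2}(-1 + \sqrt{-7})]$         2      1.99   2      2.02
$\mathbb{Z}[\frac{1}{2}(-1 + \sqrt{-11})]$        1.5    1.49   3.42   3.48
$\mathbb{Z}[\frac{1}{2}(-1 + \sqrt{-19})]$        1.16   1.15   18.7   19.84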



6 Conclusion 

We have presented binary gcd like algorithms for computing the gcd in four
complex quadratic rings. The main contribution of the paper is that this approach
can be made to work for unique factorization domains which are not Euclidean.
Our algorithms are quite simple to implement and are practical in use. It is an
open question whether it is possible to extend these algorithms to the remaining three
imaginary quadratic unique factorization domains.

The two step analysis used for $\mathbb{Z}[\frac{1}{2}(-1 + \sqrt{-19})]$ is not restricted to
conjugate primes. Using similar analysis one can show that algorithms similar to Alg. 3
can be constructed for $\mathbb{Z}[\sqrt{-2}]$ using the prime $\sqrt{-2}$ and for $\mathbb{Z}[\frac{1}{2}(-1 + \sqrt{\ldots})]$
using the prime 2. More details are available in [1].
Abstract. Given an algebraic number field $K$, such that $[K : \mathbb{Q}]$ is
constant, we show that the problem of computing the units group $O_K^*$ is
in the complexity class SPP. As a consequence, we show that principal
ideal testing for an ideal in $O_K$ is in SPP. Furthermore, assuming the
GRH, the class number of $K$, and a presentation for the class group of
$K$ can also be computed in SPP. A corollary of our result is that solving
PELL'S EQUATION, recently shown by Hallgren [12] to have a quantum
polynomial-time algorithm, is also in SPP.



1 Introduction 

The computation of units in a number field is a fundamental problem in compu- 
tational number theory and is considered an important algorithmic task in the 
area. It has been the subject of considerable research in the last two decades and 
several algorithmic results as well as some complexity-theoretic results have been
pioneered. Much of this research (e.g. [17,7,6]) is based on ideas developed by 
Buchmann in [3,4]. 

In the present paper we are interested in the following problems in
computational number theory, from a structural complexity perspective. Let $K$ be a
number field given by its minimal polynomial.

1. Computing a fundamental system of units that generates the units group
$O_K^*$ in $O_K$.

2. Computing a presentation (i.e. a set of generators and relators) for the class
group $Cl(K)$ of $K$ and the class number $h(K)$.

3. Testing if a given ideal $A$ of $O_K$ is a principal ideal.

From a purely complexity theory perspective, earlier research on these prob- 
lems was by McCurley [16], and Buchmann and Williams [7]. This was followed 
by Thiel's work [17] where it is shown that the problem of principal ideal
testing is in NP. Furthermore, assuming the Generalized Riemann Hypothesis, it
is shown in [17] that principal ideal testing and verifying the class number are
in NP $\cap$ coNP.

Our interest to further investigate the computational complexity of these 
problems is motivated by the recent exciting work of Hallgren [12] where it 

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 72-86, 2004. 
is shown that computing a solution to PELL'S EQUATION is in BQP (the
class of problems that have polynomial-time quantum algorithms). Hallgren's
main result is that given a quadratic number field $K$, its regulator $R_K$ can
be computed by a polynomial-time quantum algorithm. The regulator is the
solution to PELL'S EQUATION. For quadratic fields, Hallgren also indicates 
how principal ideal testing and computing the class group are problems in BQP. 
Hallgren’s results, however, do not appear to generalize to number fields of larger 
degree. Thus, it remains an open problem if these problems are in BQP for 
number fields of degree more than two. 

How does the class BQP relate to standard complexity classes defined using 
classical Turing machines? Fortnow and Rogers [11] show that BQP is contained 
in the counting complexity class AWPP (definitions in Section 1.1). Thus, in 
a sense, we can think of BQP as a counting class. Counting classes are an area
of research in structural complexity theory motivated by Valiant's class #P
(see e.g. [10]). Intuitively, counting complexity classes are defined by suitable
restrictions on the number of accepting and rejecting paths in nondeterministic 
Turing machines. In the rest of this section we give formal definitions followed 
by a summary of our results. 

1.1 SPP and Other Counting Complexity Classes 

Let $\Sigma = \{0,1\}$ be the finite alphabet. Let lg denote logarithm to base 2. Let FP denote the class of polynomial-time computable functions and let NP denote the class of all languages accepted by polynomial-time nondeterministic Turing machines. Let $\mathbb{Z}$ denote the integers. A function $f : \Sigma^* \to \mathbb{Z}$ is said to be gap-definable if there is an NP machine $M$ (i.e. a nondeterministic polynomial time Turing machine $M$) such that, for each $x \in \Sigma^*$, $f(x)$ is the difference between the number of accepting paths and the number of rejecting paths of $M$ on input $x$. Let GapP denote the class of gap-definable functions [10]. For each NP machine $M$ let $\mathrm{gap}_M$ denote the GapP function defined by it.
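The definition of a gap-definable function can be made concrete with a toy sketch. The code below is only an illustration under stated assumptions: the NP machine is modelled by a hypothetical predicate `accepts(x, path)` over all binary guess strings of a fixed length, and the gap is computed by brute-force enumeration (exponential time). The names `gap` and `divisor_machine` are not from the paper.

```python
from itertools import product

def gap(accepts, x, path_len):
    """Return (#accepting paths) - (#rejecting paths) of a toy NP machine.

    `accepts(x, path)` is a hypothetical deterministic predicate standing in
    for the machine M run on input x with nondeterministic guesses `path`
    (a tuple of bits). Every path has the same length `path_len`.
    """
    accepting = sum(1 for path in product((0, 1), repeat=path_len)
                    if accepts(x, path))
    total = 2 ** path_len
    return accepting - (total - accepting)

def divisor_machine(x, path):
    """Toy machine: the path encodes an integer d; accept iff d is a
    nontrivial divisor of x."""
    d = int("".join(map(str, path)), 2)
    return 1 < d < x and x % d == 0

# 12 has the nontrivial divisors 2, 3, 4, 6 among the 16 paths of length 4,
# so there are 4 accepting and 12 rejecting paths.
g = gap(divisor_machine, 12, 4)
```

A language is in SPP when some machine has gap exactly 1 on members and exactly 0 on non-members; the toy machine above does not have that property and only illustrates how the gap is counted.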

A language $L$ is in UP if there is an NP machine $M$ accepting $L$ such that $M$ has at most one accepting path on any input. The class UP was defined by Valiant and it captures the complexity of one-way functions. The complexity class SPP is defined as follows. A language $L$ is in SPP if there is an NP machine $M$ such that $x \in L$ implies $\mathrm{gap}_M(x) = 1$ and $x \notin L$ implies $\mathrm{gap}_M(x) = 0$. In this case we say that $L$ is accepted by the machine $M$. Note that the class SPP is essentially a GapP analogue of the class UP, and UP $\subseteq$ SPP.

We say that $f$ is in GapP$^A$, for an oracle $A \subseteq \Sigma^*$, if there is an NP$^A$ machine $M^A$ such that, for each $x \in \Sigma^*$, $f(x)$ is the difference between the number of accepting paths and the number of rejecting paths of $M^A$ on input $x$. For an oracle $A$, we can now define the class SPP$^A$.

The class PP is defined as follows: a language $L$ is in PP if there is an $f \in$ GapP such that $x \in L$ if and only if $f(x) > 0$. PP is a hard counting class: by Toda's theorem we know that PH $\subseteq$ P$^{\mathrm{PP}}$. We say that a language $A \subseteq \Sigma^*$ is low for PP if PP$^A$ = PP. Characterizing the class of languages low for PP is an intriguing open question in structural complexity.
In [10] it is shown that every language in SPP is low for PP. Additionally, SPP has nice closure properties [10]: P$^{\mathrm{SPP}}$ = SPP$^{\mathrm{SPP}}$ = SPP. Another class that is low for PP [14] is BPP (the class of languages with polynomial-time randomized algorithms with error probability bounded by, say, 1/3). Subsequently, the complexity class AWPP was introduced¹ in [9]. The class AWPP generalizes both BPP and SPP, and it is shown that every language in AWPP is low for PP. To complete the picture relating these classes, Fortnow and Rogers in [11] show that BQP is contained in AWPP. It is interesting to note that NP $\cap$ coNP is not known to be low for PP. Here is a diagram that shows the containments between the complexity classes discussed here.

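[Diagram of containments: P $\subseteq$ UP $\subseteq$ SPP $\subseteq$ AWPP and P $\subseteq$ BPP $\subseteq$ BQP $\subseteq$ AWPP.]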

Although no containment is known between BQP and SPP, it is interesting to compare these classes in terms of the natural problems they contain. Important problems known to be in SPP are Graph Isomorphism and the hidden subgroup problem for permutation groups [1]. These problems have resisted efficient deterministic or randomized algorithms, but are considered potential candidates for quantum algorithms. On the other hand, FP$^{\mathrm{SPP}}$ contains Integer Factoring and Discrete Log, which have polynomial-time quantum algorithms.²







1.2 The New Results and the Methods 

We now state the main results of the paper. 

(a) Given a number field $K$ (by its minimal polynomial as input), the problem of computing a fundamental system of units is in FP$^{\mathrm{SPP}}$, assuming that $K$ is a constant degree extension of $\mathbb{Q}$. As a consequence, finding the regulator of $K$ up to polynomially many bits of approximation is also in FP$^{\mathrm{SPP}}$. As a corollary, the PELL'S EQUATION problem is in FP$^{\mathrm{SPP}}$.

(b) Given a constant-degree number field $K$ and an ideal $A$ of the ring $O_K$, testing if $A$ is a principal ideal is in SPP.

¹ For the definition see [9].

² In fact, these problems are even in [...]. Also, as SPP$^{\mathrm{SPP}}$ = SPP, notice that the class FP$^{\mathrm{SPP}}$ is essentially SPP: for $f \in$ FP$^{\mathrm{SPP}}$ and input $x$, the bits of $f(x)$ can be computed in SPP. A similar closure property holds for BQP.
(c) Given a constant-degree number field $K$ (by its minimal polynomial as input), the problem of computing the class group of $K$ (by finding a generator-relator presentation for it) and finding the class number of $K$ is in FP$^{\mathrm{SPP}}$, assuming GRH.

In particular, PELL'S EQUATION is also in FP$^{\mathrm{SPP}}$. Thus, we add to the list of natural problems that are in both SPP and BQP. A brief outline of the methods used to show the above results is given below.

Let $M$ be an oracle Turing machine. For a language $A$ in NP, we say that $M^A$ makes UP-like queries to $A$ if there is an NP machine $N$ accepting $A$ such that on all inputs $x$, $M^A(x)$ makes only such queries $y$ for which $N(y)$ has at most one accepting path. Effectively, it is like $M$ having access to a UP oracle. We state a useful variant of a result from [15].

Theorem 1 ([15]). Let $M$ be a nondeterministic polynomial-time oracle machine with oracle $A \in$ NP such that $M^A$ makes UP-like queries to $A$. Then the function $h(x) = \mathrm{gap}_{M^A}(x)$ is in GapP.

Next, we recall an important property of the class SPP shown in [10]. 

Theorem 2 ([10]). If $L$ is in SPP$^A$ for some oracle $A \in$ SPP then $L \in$ SPP. I.e. SPP$^{\mathrm{SPP}}$ = SPP.

The following lemma, which is a straightforward consequence of Theorem 1 
and of Theorem 2, is in a form useful for this paper. 

Lemma 1. 

- Suppose $L$ is in SPP$^A$, accepted by the nondeterministic polynomial-time oracle machine $M^A$ with oracle $A \in$ NP (i.e. $x \in L$ implies that $\mathrm{gap}_{M^A}(x) = 1$, and $x \notin L$ implies that $\mathrm{gap}_{M^A}(x) = 0$), such that the machine $M^A$ makes UP-like queries to $A$. Then $L$ is in SPP.

- Suppose a function $f : \Sigma^* \to \Sigma^*$ is in FP$^A$ (i.e. $f$ is computed by a polynomial-time oracle transducer $M^A$) where $A \in$ NP, such that the machine $M^A$ makes UP-like queries to $A$. Then $f$ is in FP$^{\mathrm{SPP}}$.

Lemma 1 is a crucial tool in obtaining the FP$^{\mathrm{SPP}}$ upper bounds. For computing a fundamental system of units in FP$^{\mathrm{SPP}}$ we first show that a bound $B \in \mathbb{Q}$ can be computed in FP$^{\mathrm{SPP}}$ such that the regulator $R_K$ of $K$ lies between $B$ and $2B$. Once such a bound is computed, we again apply an algorithm based on Lemma 1 to compute a canonical fundamental system of units in FP$^{\mathrm{SPP}}$. This notion of a canonical fundamental system of units is developed and explained in Sections 2 and 3, where we show how to transform an arbitrary fundamental system of units to the canonical set.

Once we have the FP$^{\mathrm{SPP}}$ upper bound for computing fundamental units, we can design an SPP algorithm for principal ideal testing. If we assume the generalized Riemann hypothesis then, by a result of Bach [2], we can apply our SPP algorithm for principal ideal testing and give an FP$^{\mathrm{SPP}}$ algorithm for computing the class group $Cl(K)$.
1.3 Comparison with Previous Results

As mentioned, Thiel [17] has shown that principal ideal testing is in NP. Thiel also shows, assuming the GRH, that principal ideal testing and verifying the class number are in NP $\cap$ coNP. On the other hand, our results on principal ideal testing and computing a fundamental system of units are unconditional, but applicable only to number fields of constant degree. The FP$^{\mathrm{SPP}}$ upper bound for the class group problem depends on the GRH.

No containment relation is known between SPP and NP $\cap$ coNP or BQP and NP $\cap$ coNP. Furthermore, we remark here that NP $\cap$ coNP is not known to be low for PP. Thus, the results of Thiel [17] are incomparable to our results.

An important computational aspect in all our results is the notion of compact representation as explained by Thiel [17], based on Buchmann's earlier papers [3,4]. We need compact representations to succinctly express units as well as the generating element of a principal ideal in $O_K$.



2 Compact Representation 



Let $K$ be a number field of degree $n$ and let $O$ be the ring of integers of $K$. Let $D$ be the discriminant of $K$. For an element $\alpha \in K$, by $N(\alpha)$ we mean the norm $N_{K/\mathbb{Q}}(\alpha)$. Without loss of generality we assume that the input to the algorithm is $O$ presented as a $\mathbb{Z}$-module with basis $\omega_1, \ldots, \omega_n$ and integer constants $c_{ijk}$ such that $\omega_i \omega_j = \sum_k c_{ijk}\,\omega_k$. For, computing the maximal order from a given order is reducible to the problem of finding the square-free part of an integer, which can be done in FP$^{\mathrm{SPP}}$, as factoring integers is in FP$^{\mathrm{SPP}}$. By the size of $O$ we mean $\sum \mathrm{size}(c_{ijk})$. The constants $c_{ijk}$ will be called the explicit data for $K$.
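To see how the explicit data determines the arithmetic of $O$, here is a minimal sketch of multiplication in the basis $\omega_1, \ldots, \omega_n$ using structure constants. The function name `multiply` and the example order $\mathbb{Z}[\sqrt{2}]$ are assumptions chosen for illustration and do not appear in the paper.

```python
def multiply(a, b, c):
    """Multiply two elements given by integer coefficient vectors.

    a, b : coefficients (a_1..a_n), (b_1..b_n) with respect to w_1..w_n
    c    : structure constants, c[i][j][k] with w_i * w_j = sum_k c[i][j][k] * w_k
    """
    n = len(a)
    out = [0] * n
    for i in range(n):
        for j in range(n):
            if a[i] == 0 or b[j] == 0:
                continue
            for k in range(n):
                out[k] += a[i] * b[j] * c[i][j][k]
    return out

# Illustrative order: Z[sqrt(2)] with w_1 = 1 and w_2 = sqrt(2).
C = [
    [[1, 0], [0, 1]],   # 1 * 1 = 1,            1 * sqrt(2) = sqrt(2)
    [[0, 1], [2, 0]],   # sqrt(2) * 1 = sqrt(2), sqrt(2) * sqrt(2) = 2
]

unit = [1, 1]                      # 1 + sqrt(2)
square = multiply(unit, unit, C)   # (1 + sqrt(2))^2 = 3 + 2*sqrt(2)
```

The triple loop costs $O(n^3)$ integer multiplications, which is polynomial in the size of the explicit data.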

Fractional ideals $\mathfrak{a}$ of $O$ will be presented by giving a $\mathbb{Z}$-basis for $\mathfrak{a}$. Let $a_i = \sum_j a_{ij}\,\omega_j$, $1 \le i \le n$, be a basis of $\mathfrak{a}$; then by HNF($\mathfrak{a}$) we mean the Hermite normal form of the matrix $(a_{ij})$. Once the $\omega_i$'s are fixed, for every ideal $\mathfrak{a}$, HNF($\mathfrak{a}$) is unique. Since the Hermite normal form of a matrix can be computed in polynomial time, this gives a polynomial time algorithm for testing whether two ideals are equal.
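The equality test via Hermite normal forms can be sketched as follows. This is an illustrative implementation for nonsingular square integer matrices whose rows are a $\mathbb{Z}$-basis; it uses plain Euclidean row operations and makes no attempt to control intermediate coefficient growth, which a genuinely polynomial-time algorithm must do. The function name `hnf` and the example ideals in $\mathbb{Z}[i]$ are assumptions for illustration.

```python
def hnf(rows):
    """Row-style Hermite normal form of a nonsingular square integer matrix.

    The rows are a Z-basis of a lattice. The result is the unique upper
    triangular basis with positive pivots and with the entries above each
    pivot reduced into the range [0, pivot).
    """
    m = [list(r) for r in rows]
    n = len(m)
    for col in range(n):
        # Euclid on rows col..n-1 until at most one is nonzero in this column.
        while True:
            nz = [r for r in range(col, n) if m[r][col] != 0]
            if len(nz) <= 1:
                break
            p = min(nz, key=lambda r: abs(m[r][col]))
            for r in nz:
                if r != p:
                    q = m[r][col] // m[p][col]
                    m[r] = [x - q * y for x, y in zip(m[r], m[p])]
        nz = [r for r in range(col, n) if m[r][col] != 0]
        if not nz:
            raise ValueError("matrix is singular")
        p = nz[0]
        m[col], m[p] = m[p], m[col]
        if m[col][col] < 0:
            m[col] = [-x for x in m[col]]
        # Reduce the entries above the pivot.
        for r in range(col):
            q = m[r][col] // m[col][col]
            m[r] = [x - q * y for x, y in zip(m[r], m[col])]
    return m

# Two Z-bases of the ideal (1 + i) in Z[i], written in the basis {1, i}.
basis_1 = [[1, 1], [-1, 1]]   # 1 + i and i * (1 + i)
basis_2 = [[2, 0], [1, 1]]    # 2 and 1 + i
same_ideal = hnf(basis_1) == hnf(basis_2)
```

Both bases reduce to the same normal form, while a basis of the ideal $(2)$ reduces to a different one.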

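Let $\sigma_1, \ldots, \sigma_r$ be all the $r$ real embeddings and $\sigma_{r+1}, \bar{\sigma}_{r+1}, \ldots, \sigma_{r+s}, \bar{\sigma}_{r+s}$ be all the $2s$ complex embeddings of $K$. Define the $r + s$ absolute values on $K$ as follows.

$$|\alpha|_i = \begin{cases} |\sigma_i(\alpha)| & \text{if } 1 \le i \le r \\ |\sigma_i(\alpha)|^2 & \text{if } r + 1 \le i \le r + s \end{cases}$$

For $\alpha \in K$, by the height of $\alpha$, denoted by $H(\alpha)$, we mean $\max\{|\alpha|_i : 1 \le i \le r + s\}$.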
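As a small numerical illustration of these definitions, the sketch below computes the absolute values and the height from the images of an element under its embeddings. The helper names and the two example fields are assumptions for illustration; floating point is used, so the values are approximations.

```python
import math

def absolute_values(real_images, complex_images):
    """|alpha|_i: modulus at the real embeddings, squared modulus at one
    representative of each pair of complex embeddings."""
    return ([abs(x) for x in real_images]
            + [abs(z) ** 2 for z in complex_images])

def height(real_images, complex_images):
    """H(alpha) = max_i |alpha|_i."""
    return max(absolute_values(real_images, complex_images))

# Q(i): r = 0, s = 1. The element 3 + 4i has |alpha|_1 = |3 + 4i|^2 = 25.
h_gauss = height([], [complex(3, 4)])

# Q(sqrt(2)): r = 2, s = 0. The element 1 + sqrt(2) has images
# 1 + sqrt(2) and 1 - sqrt(2) under the two real embeddings.
s2 = math.sqrt(2.0)
h_real = height([1 + s2, 1 - s2], [])
```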



Lemma 2. Given $O$, we have $H(\omega_i) \le n\,2^{\mathrm{size}(O)}$ and $\lg D \le n(2 \lg n + \mathrm{size}(O))$.



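Proof. Let $l$ be such that for $1 \le i \le n$, $H(\omega_l) \ge H(\omega_i)$. Then we have

$$H(\omega_l^2) \le \sum_k |c_{llk}|\, H(\omega_k) \le n\,2^{\mathrm{size}(O)}\, H(\omega_l).$$

Since $H(\omega_l^2) = H(\omega_l)^2$, it follows that $H(\omega_l) \le n\,2^{\mathrm{size}(O)}$. Also, since $D = \det(\sigma_i(\omega_j))^2$, we have $\lg D \le n(2 \lg n + \mathrm{size}(O))$.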

Fix a basis $\{\omega_i\}_{i=1}^n$ for $O$. For any $\alpha \in O$ there is a unique set of integers $a_1, \ldots, a_n$ such that $\alpha = \sum a_i \omega_i$. By giving the vector of integers $a_i$ the algebraic integer $\alpha$ is completely specified. Following [17], this is the standard representation of $\alpha$, which is unique for a fixed $\mathbb{Z}$-basis for $O$. The size of $\alpha$ in standard representation is $\mathrm{size}_s(\alpha) = \sum \mathrm{size}(a_i)$.

The following result from [17] describes a compact representation of algebraic 
integers. 



Theorem 3. For a G O there exists k < lg(lg(H) + (n — l)lgH(a)) + 2 and 
"f,ai G O and di G I < i < k, with H ( 7 ) < N H (a^) < £) and 

0 < di < '/D such that 



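$$\alpha = \gamma \prod_{i=1}^{k} \left( \frac{\alpha_i}{d_i} \right)^{2^{k-i}}.$$

Moreover, for $1 \le j \le k$ the ideal $\prod_{i=1}^{j} (\alpha_i / d_i)^{2^{j-i}}\, O$ is a reduced ideal.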



A product of this form can be presented as a tuple $(k, \gamma, (\alpha_i, d_i)_{i=1}^{k})$. For a given compact representation of $\alpha$, the size of the representation is the sum of the sizes of the integers $d_i$ and the sizes of the algebraic integers $\gamma$ and $\alpha_i$ in their standard representation. Compact representations are not unique for a given $\alpha$, even for a given $\mathbb{Z}$-basis. Let $\mathrm{size}_c(\alpha)$ denote the maximum of the sizes of all compact representations of $\alpha$. Using Lemma 2 and [17, Corollary 15] we have the following theorem on compact representations.



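Theorem 4. [17] For nonzero $\alpha \in O$

$$\mathrm{size}_s(\alpha) \le (n \lg H(\alpha) \cdot \mathrm{size}(O))^{O(1)},$$

$$\mathrm{size}_c(\alpha) \le (n^2 \lg^2(n) \cdot \mathrm{size}(O) \cdot \lg(\mathrm{size}(O)) \cdot N(\alpha) \cdot \lg\lg H(\alpha))^{O(1)}.$$

Furthermore, given the compact representation $(k, \gamma, (\alpha_i, d_i)_{i=1}^{k})$ of $\alpha$, there is a polynomial time algorithm that computes the Hermite Normal Form for the ideal $\alpha O$.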

Conversely, the following proposition from [17] gives a bound on the height of an algebraic integer based on the size of its representation.

Proposition 1. Let $\alpha$ be any algebraic integer of $K$, a number field of discriminant $D$. Then we have:

1. For all j, H(a) < rj 22 size,(a)-esi^e( 0 ) ^ 

2. For all j. In H (a) < In N (a) -|- Ig n • sizeda) ■ size(O) ■ . 
3 Minimal Bases for Lattices

For a set of linearly independent vectors $a_1, a_2, \ldots, a_n$ let $a_1^*, a_2^*, \ldots, a_n^*$ denote the corresponding GSO (Gram-Schmidt) basis. Given a lattice $\Lambda$ in $\mathbb{R}^n$ with basis $\{b_i\}_{i=1}^n$, let $(\mu_{ij})$ be the matrix that transforms the GSO basis given by the $b_i^*$'s to the basis given by the $b_i$'s. We say that the basis is proper if for every $j < i \le n$ we have $-\frac{1}{2} \le \mu_{ij} < \frac{1}{2}$. The following holds for any lattice.

Lemma 3. Given a lattice $\Lambda$ with basis $a_1, a_2, \ldots, a_n$, a new basis $b_1, b_2, \ldots, b_n$ can be computed in polynomial time such that the $b_i$'s form a proper basis and $b_i^* = a_i^*$ (i.e. the GSO bases of both are the same).

Proof. Here is the algorithm.

    $b_1 := a_1$;
1   for $i = 2$ to $n$ do
        Let $a_i = a_i^* + \sum_{j<i} \mu_{ij} a_j^*$;
        $b_i := a_i$;
2       for $j = i - 1$ downto $1$ do
            if $\mu_{ij} \ge \frac{1}{2} \lor \mu_{ij} < -\frac{1}{2}$ then
                Let $c$ be the nearest integer to $\mu_{ij}$;
3               $b_i := b_i - c\,b_j$;
            end
        end
    end



The invariant for the loop in step 1 is $-\frac{1}{2} \le \mu_{kj} < \frac{1}{2}$ for all $1 \le j < k \le i - 1$. If the invariant is violated at $i$ for some $j$ then the loop in step 2 fixes it. For a given $k$, note that step 3 (executed with $j = k$) does not affect any of the $\mu_{ij}$ for $j > k$. It is also clear that step 3 does not affect the GSO of the basis.
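The size-reduction procedure in this proof can be sketched in code. The version below is an illustration rather than the authors' implementation: it uses exact rational arithmetic, recomputes the Gram-Schmidt coefficients after every update for clarity instead of efficiency, and resolves rounding ties so that the coefficients land in $[-\frac{1}{2}, \frac{1}{2})$, which is one possible convention. The names `gso` and `make_proper` are assumptions.

```python
import math
from fractions import Fraction

def gso(basis):
    """Gram-Schmidt orthogonalisation with exact rationals.

    Returns (b_star, mu) where mu[i][j] is the coefficient of b_star[j]
    in basis[i], for j < i.
    """
    b_star, mu = [], []
    for i, b in enumerate(basis):
        v = [Fraction(x) for x in b]
        coeffs = []
        for j in range(i):
            bj = b_star[j]
            m = (sum(Fraction(x) * y for x, y in zip(b, bj))
                 / sum(y * y for y in bj))
            coeffs.append(m)
            v = [x - m * y for x, y in zip(v, bj)]
        b_star.append(v)
        mu.append(coeffs)
    return b_star, mu

def make_proper(basis):
    """Size-reduce an integer basis without changing its GSO basis."""
    b = [list(v) for v in basis]
    for i in range(1, len(b)):
        for j in range(i - 1, -1, -1):
            _, mu = gso(b)
            # Nearest integer to mu[i][j], ties rounded up.
            c = math.floor(mu[i][j] + Fraction(1, 2))
            if c != 0:
                b[i] = [x - c * y for x, y in zip(b[i], b[j])]
    return b

reduced = make_proper([[2, 0], [3, 1]])
```

Subtracting an integer multiple of $b_j$ from $b_i$ changes only the coefficients $\mu_{ik}$ with $k \le j$, which is why the inner loop runs downward.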

Given the vector space $W = U \oplus V$ such that $U$ and $V$ are orthogonal, for $w \in W$, if $w = u + v$, $u \in U$ and $v \in V$, then $w/U$ denotes the vector $v$ (i.e. the component of $w$ orthogonal to the space $U$). For a lattice $\Lambda$, $\Lambda/V$ is the lattice $\{v/V : v \in \Lambda\}$. If $b_1, b_2, \ldots, b_n$ forms a basis for $\Lambda$ then any vector of $\Lambda/V$ can be expressed as an integer linear combination of the $b_i/V$'s.

Given a lattice $\Lambda$, a basis $b_1, b_2, \ldots, b_n$ is called a minimal basis if it is proper and it satisfies the following conditions:

1. $b_1$ is an $\ell_1$-shortest vector in $\Lambda$.

2. For all $i$, if $V_{i-1}$ is the span of the vectors $b_1, b_2, \ldots, b_{i-1}$ then $b_i/V_{i-1}$ is the vector of least $\ell_1$ norm in the lattice $\Lambda/V_{i-1}$.

To find a canonical basis for the lattice $\Lambda$ one can define a total order on the set of all minimal bases of $\Lambda$ and choose the least basis under that order. For two vectors $u = \sum u_i e_i$ and $v = \sum v_i e_i$, we say $u \prec v$ if $\|u\|_1 < \|v\|_1$, or if $\|u\|_1 = \|v\|_1$ and there is a $1 \le i \le n$ such that $u_j = v_j$ for all $1 \le j < i$ and $u_i < v_i$.

Consider two minimal bases $A = \{a_i\}_{i=1}^n$ and $B = \{b_i\}_{i=1}^n$ for the lattice $\Lambda$. Let $A^* = \{a_i^*\}_{i=1}^n$ and $B^* = \{b_i^*\}_{i=1}^n$ be their GSO bases respectively. For two such minimal bases, $A \prec B$ if there is an $i$ such that for all $j < i$, $a_j = b_j$ and $a_i^* \prec b_i^*$.



Theorem 5. On the set of minimal bases, the relation $\prec$ forms a total order.

Proof. Suppose $\prec$ does not form a total order. Then we have two minimal bases $A$ and $B$ and an index $i$ such that $a_j = b_j$ for all $j < i$ and $a_i^* = b_i^*$, yet $a_i \ne b_i$. Expressing $a_i$ and $b_i$ in the respective GSO bases we have

$$a_i = \sum_{j=1}^{i} \alpha_j a_j^*, \qquad b_i = \sum_{j=1}^{i} \beta_j b_j^*.$$

Let $k$ be the index such that for all $j > k$, $\alpha_j = \beta_j$ and $\alpha_k \ne \beta_k$. Clearly $k < i$. Since the bases $A$ and $B$ are proper, $\alpha_j$ and $\beta_j$ lie in the interval $[-\frac{1}{2}, \frac{1}{2})$. Consider the vector $u = a_i - b_i$. Since $a_j^* = b_j^*$ for all $1 \le j \le i$ we have $u = (\alpha_k - \beta_k) a_k^* + u'$ where $u'$ lies in the vector space $V_{k-1}$ that is spanned by $\{a_j\}_{j=1}^{k-1}$. But then $\|u/V_{k-1}\|_1 = |\alpha_k - \beta_k|\, \|a_k^*\|_1 < \|a_k/V_{k-1}\|_1$, which contradicts the fact that $A$ and $B$ are minimal.

Given an algorithm for finding the $\prec$-least vector of a lattice, Theorem 5 suggests the following algorithm for finding the $\prec$-minimum basis for a lattice $\Lambda$.

Input: A set of linearly independent vectors $b_1, b_2, \ldots, b_n$.

Output: The $\prec$-minimum basis for $\Lambda$.

Let $\Lambda$ be the lattice $\mathbb{Z}[b_1, b_2, \ldots, b_n]$;

Find the $\prec$-least element $u$ of the lattice $\Lambda$;

Let $\{u_i\}_{i=1}^n$ be a basis for $\Lambda$ such that $u_1 = u$;

Let $u'_k = u_k - \frac{\langle u_k, u \rangle}{\langle u, u \rangle}\, u$, $1 < k \le n$;

Find the $\prec$-minimum basis for the lattice $\Lambda'$ generated by $\{u'_k\}_{k=2}^n$ recursively;
Let this basis be $\{b'_i = \sum_{j=2}^{n} a_{ij} u'_j\}_{i=2}^n$;

Consider the basis $\{a_i\}_{i=1}^n$ defined by $a_1 = u$ and $a_i = \sum_{j=2}^{n} a_{ij} u_j$, $2 \le i \le n$;

Converting this basis to a proper basis using Lemma 3 completes the algorithm;

Algorithm 1: Computing the $\prec$-minimal basis



To complete this section we give an algorithm for finding the $\prec$-minimum element of a lattice $\Lambda$. We apply Lenstra's algorithm for integer linear programming [13], which runs in polynomial time if the number of variables is constant.

Lemma 4. If $\Lambda \subseteq \mathbb{R}^n$ is a lattice of rank $r$ then, assuming $n$ is a constant, there is a polynomial time algorithm for finding the $\prec$-minimum element of $\Lambda$.

Proof. Let $\Lambda = \mathbb{Z}[b_1, b_2, \ldots, b_r]$ where the basis vectors $b_i$ are given by $b_i = \sum_{j=1}^{n} b_{ij} e_j$.

The algorithm goes in two steps. The first step is to find a vector with least $\ell_1$ norm. In the subsequent steps this solution is refined further till we get the $\prec$-minimum vector.
To find a vector with least $\ell_1$ norm we have the following integer programming problem in $r$ variables $x_1, x_2, \ldots, x_r$: minimize the expression $f(x) = \sum_{i=1}^{n} \left| \sum_{j=1}^{r} b_{ji} x_j \right|$ given $f(x) > 0$. Although this is not an integer linear programming problem, one can use the algorithm of Lenstra here as follows: for each of the $2^n$ different vectors $c \in \{-1, 1\}^n$, solve the following integer linear programming problem and pick the best among those $2^n$ different solutions.



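Minimize

$$f_c(x) = \sum_{i=1}^{n} \sum_{j=1}^{r} c_i\, b_{ji}\, x_j$$

under the constraints

$$f_c(x) > 0 \qquad (1)$$

$$0 \le c_i \sum_{j=1}^{r} b_{ji}\, x_j \le B, \quad 1 \le i \le n \qquad (2)$$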



The first constraint expresses the fact that the solution should be nonzero. The second set of constraints expresses the fact that we are choosing the right $c_i$'s.

The $B$ in the constraints is an upper bound on the $\ell_\infty$ norm of the shortest vector. The $\ell_1$ norm of any particular vector in the basis will be a suitable value for $B$.

Having obtained a solution for the $\ell_1$-shortest vector, say $u$, one has to refine the solution to get a $\prec$-minimum solution. Let $u_j$ denote an $\ell_1$-shortest vector which agrees with the $\prec$-minimum vector on all coordinates less than or equal to $j$; then $u_n$ is the desired solution. Let $u_0 = u$. Having got the vector $u_j$, to compute $u_{j+1}$ we minimize $\sum_{i=1}^{r} x_i b_{i,j+1}$ under the constraints $\| \sum_{i=1}^{r} x_i b_i \|_1 = \| u_j \|_1$ and $\sum_{i=1}^{r} x_i b_{ik} = u_{jk}$ for $1 \le k \le j$. Here $u_{jk}$ denotes the component of $u_j$ in the direction $e_k$. One can use the same trick as before to convert this to an integer linear programming problem and use Lenstra's algorithm. Since the dimension $n$ is bounded, the running time is polynomial.
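To make the target of this proof concrete, the following sketch finds the least nonzero lattice vector under the order described in this section (first by $\ell_1$ norm, then lexicographically by coordinates). It is a brute-force stand-in that searches a bounded box of integer coefficients and is not Lenstra's algorithm; the function names and the coefficient bound are assumptions for illustration only.

```python
from itertools import product

def order_key(v):
    """Sort key realising the order: l1 norm first, then coordinates."""
    return (sum(abs(x) for x in v), tuple(v))

def least_vector_bruteforce(basis, bound):
    """Least nonzero vector among integer combinations of `basis` whose
    coefficients all lie in [-bound, bound]."""
    n = len(basis[0])
    best = None
    for coeffs in product(range(-bound, bound + 1), repeat=len(basis)):
        if not any(coeffs):
            continue  # skip the zero vector
        v = [sum(c * b[k] for c, b in zip(coeffs, basis)) for k in range(n)]
        if best is None or order_key(v) < order_key(best):
            best = v
    return best

# The lattice spanned by (1, 2) and (3, 1) has four vectors of l1 norm 3:
# (1, 2), (-1, -2), (2, -1), (-2, 1). The lexicographically least is (-2, 1).
u = least_vector_bruteforce([[1, 2], [3, 1]], bound=3)
```

The tie-breaking by coordinates is what makes the minimum unique, which the later sections rely on to obtain a canonical output.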



Combining Lemma 4 and Algorithm 1 we have the following result.

Theorem 6. Given a basis $\{b_i\}_{i=1}^{r}$ of a rank $r$ lattice $\Lambda \subseteq \mathbb{R}^n$, there is a polynomial time algorithm to compute the $\prec$-minimal basis of $\Lambda$, assuming $n$ to be a constant.



4 Units of a Number Field 

Let $K$ be a number field of degree $n$ and let $O$ be the set of algebraic integers of $K$. If $K$ has $r$ real embeddings and $2s$ complex embeddings then by Dirichlet's theorem (see, e.g. [8]) there exists a set of $m = r + s - 1$ units $\{\varepsilon_i\}_{i=1}^{m}$, called a fundamental system of units, such that every unit of $O$ can be expressed as $\zeta\, \varepsilon_1^{x_1} \cdots \varepsilon_m^{x_m}$, $x_i \in \mathbb{Z}$, where $\zeta$ is a root of unity in $K$. Consider the map $\mathrm{Log} : K \to \mathbb{R}^m$ defined as follows:

$$\mathrm{Log}(\alpha) = (\ln |\alpha|_1, \ln |\alpha|_2, \ldots, \ln |\alpha|_m)$$
From Dirichlet's theorem it follows that the set $\mathrm{Log}(O^*)$ is a lattice in $\mathbb{R}^m$ with basis $\{\mathrm{Log}(\varepsilon_i)\}_{i=1}^{m}$. Often it is necessary to work with vectors in this lattice whose coordinates are in general irrational. We will use rational approximations of these vectors instead.
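A small numerical example may help fix the Log map. The sketch below works in the real quadratic field $\mathbb{Q}(\sqrt{2})$, where $r = 2$, $s = 0$ and hence $m = 1$; the field, the unit $1 + \sqrt{2}$ and the function names are assumptions chosen for illustration, and floating point stands in for the rational approximations mentioned above.

```python
import math

SQRT2 = math.sqrt(2.0)

def embeddings(a, b):
    """Images of a + b*sqrt(2) under the two real embeddings of Q(sqrt(2))."""
    return (a + b * SQRT2, a - b * SQRT2)

def log_map(a, b):
    """Log(alpha) = (ln|alpha|_1, ..., ln|alpha|_m) with m = r + s - 1 = 1."""
    sigma = embeddings(a, b)
    return (math.log(abs(sigma[0])),)

log_eps = log_map(1, 1)[0]    # Log of the unit 1 + sqrt(2)
log_eps2 = log_map(3, 2)[0]   # Log of (1 + sqrt(2))^2 = 3 + 2*sqrt(2)
```

Multiplication of units becomes addition of vectors, so `log_eps2` equals `2 * log_eps` up to rounding, and the lattice $\mathrm{Log}(O^*)$ here is generated by the single number $\ln(1 + \sqrt{2})$.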

An important algorithmic task that will be useful in the FP$^{\mathrm{SPP}}$ algorithm is to compute a canonical fundamental system of units from a given fundamental system of units $U = \{\varepsilon_i\}_{i=1}^{m}$. In this section we give a polynomial time algorithm for the above task assuming that the degree $[K : \mathbb{Q}]$ is constant. The next theorem is a re-statement of [17, Lemma 16] using the bound in Lemma 2.

Theorem 7. [17] There exists a fundamental system of units $\{\varepsilon_i\}_{i=1}^{m}$ for $O$ such that $\mathrm{size}_c(\varepsilon_i) = (n \cdot \mathrm{size}(O))^{O(1)}$ for all $1 \le i \le m$.

Consider the fundamental system of units $\{\eta_i\}_{i=1}^{m}$ that corresponds to the $\prec$-minimal basis of the lattice $\mathrm{Log}(O^*)$. We have the following observation.

Lemma 5. For all $1 \le i \le m$, $\mathrm{size}_c(\eta_i) = (n \cdot \mathrm{size}(O))^{O(1)}$.

Proof. The basis $\{b_i = \mathrm{Log}(\eta_i)\}_{i=1}^{m}$ is the $\prec$-minimal basis of $\Lambda = \mathrm{Log}(O^*)$. Let
$b_i^*$ denote the corresponding GSO basis and let $V_i = \mathrm{Span}\{b_j\}_{j=1}^{i}$. Let $\{\varepsilon_i\}_{i=1}^{m}$
be any fundamental system of units satisfying the condition in Theorem 7; then
the vectors $a_i = \mathrm{Log}(\varepsilon_i)$ span the lattice $\Lambda$. Without loss of generality we may assume that
&i ^ Vi- Since b^’s form the ^-minimal basis of A we have || b* ||^ < || b* |]^ < 
II a,/V-i 111 < m II a, ||^. Hence we have || b, ||^ < || b* ||^ -b 5 || b* || < 

m.A.(i — l)/2, where A is an upper bound on |] a^ ||^. From Theorem 7 and 
Proposition 1 we have A < {n ■ size{0) + where i is such that 

Log(£i) has the largest £00 norm. Hence for every i we have lnH(r 7 i) < (n • 
size\o) + Together with Theorem 4 we have the result. 

We now describe how a canonical fundamental system of units can be com- 
puted given an arbitrary fundamental system of units. The following theorem, 
based on a remark in [17], will be useful. 

Theorem 8. [17] Assuming that $[K : \mathbb{Q}]$ is constant, there is a polynomial time algorithm that takes as input a principal ideal $\mathfrak{a} = \alpha O$ by its Hermite Normal Form and a good rational approximation for $\mathrm{Log}(\alpha)$, and outputs a compact representation for $\zeta\alpha$ where $\zeta$ is a root of unity in $K$.

Remark 1. The point to note in the above theorem is that only $\mathrm{Log}(\alpha)$ is given (as a rational approximation) and not the compact representation of $\alpha$. Also notice that $\mathrm{Log}(\alpha)$ determines $\alpha$ only up to multiplication by roots of unity in $K$. The theorem promises that one such element $\zeta\alpha$, which depends only on HNF($\mathfrak{a}$) and $\mathrm{Log}(\alpha)$, is computable in polynomial time.

Theorem 9. Assuming $[K : \mathbb{Q}]$ is a constant, there is a polynomial time deterministic algorithm that takes as input a fundamental system of units (as compact representations) and outputs another fundamental system of units $\{\eta_i\}_{i=1}^{m}$ (as compact representations) corresponding to the $\prec$-minimal basis for $\mathrm{Log}(O^*)$. Furthermore, $\{\eta_i\}_{i=1}^{m}$ is canonical in the sense that it does not depend on the input fundamental system of units.
Proof. Given a fundamental system of units $\{\varepsilon_i\}_{i=1}^{m}$, compute $\{\mathrm{Log}(\varepsilon_i)\}_{i=1}^{m}$ to the desired approximation. Compute the $\prec$-minimal basis for the lattice generated by $\{\mathrm{Log}(\varepsilon_i)\}_{i=1}^{m}$ using the algorithm in Theorem 6. Let it be $\{b_i\}_{i=1}^{m}$. Use the algorithm in Theorem 8 to compute compact representations of units $\{\eta_i\}_{i=1}^{m}$ corresponding to the vectors $\{b_i\}_{i=1}^{m}$. Since the basis $\{b_i\}_{i=1}^{m}$ is unique (up to approximation) and since all the algorithms involved are polynomial time deterministic algorithms, the output generated is independent of the fundamental system of units that was given as input.

Given any set of units $\{\varepsilon_i\}$, we now analyze the approximation of $\mathrm{Log}(\varepsilon_i)$ required in order to accurately compute the canonical fundamental system of units.

Let $\{\eta_i\}$ be the canonical fundamental system of units and let $a_i = \mathrm{Log}(\eta_i)$. Let $b_i = \mathrm{Log}(\varepsilon_i)$. Consider the matrices $A = (a_{ij})$ and $B = (b_{ij})$ (recall that for a vector $v$, $v_i$ denotes its $i$-th component). Since the $a_i$'s and $b_i$'s span the same lattice $\Lambda = \mathrm{Log}(O^*)$, there is a unimodular transformation $U \in \mathrm{SL}_m(\mathbb{Z})$ such that $A = UB$. Note that the determinant of $B$ is the regulator, which is at least 0.2, and hence it can be shown that each entry of $U$ is of size bounded by a polynomial in the sizes of the entries of $A$ and $B$. Let $B_q$ denote the $q$-bit approximation of $B$ and let $A_q = UB_q$. We have

$$\| A_q - A \|_\infty = \| U(B_q - B) \|_\infty \le m\, \| U \|_\infty\, 2^{-q}.$$

If we take $q$ large enough so that $\| A_q - A \|_\infty$ is small enough for us to recover the compact representations of the $\eta_i$'s, we are through. It is easy to see that a $q$ that is bounded by a polynomial in the sizes of the entries of $A$ and $B$ is sufficient for this purpose.

Lemma 6. In the algorithm of Theorem 9, it suffices to approximate $\mathrm{Log}(\varepsilon_i)$ to an error of $2^{-q}$ where $q \le (\lg(\| A \|_\infty) \lg(\| B \|_\infty))^{O(1)}$.

5 Computing Units is in FP$^{\mathrm{SPP}}$

In this section we give an FP$^{\mathrm{SPP}}$ algorithm for computing a fundamental system of units for a number field $K$. The algorithm is in two stages. In the first stage it computes a number $B$ such that the regulator $R_K$ lies in the range $[B, 2B)$. Notice that, having computed such a bound $B$, we can test in deterministic polynomial time if an arbitrary set of $m$ algebraic numbers is a fundamental system of units. Given this value of $B$, in the second stage the FP$^{\mathrm{SPP}}$ algorithm computes a fundamental system of units of $K$. The first stage is described in the following lemma.

Lemma 7. Given a number field $K$, there is an FP$^{\mathrm{SPP}}$ algorithm to compute a constant $B$ such that the regulator of $K$, $R_K$, lies in the interval $[B, 2B)$.

Proof. We give a polynomial time algorithm that makes UP-like queries to an NP oracle. Consider the following NP language:

$$A = \{ (x, O_K) \mid \text{there is a subgroup of index } y \text{ in } O_K^* : x \le y R_K < 2x \}.$$

We consider the following nondeterministic procedure that accepts $A$.
Input: A rational $x$ and a basis for the ring of integers, $O$, of a number field $K$
Output: "Yes" if there is a subgroup of $O^*$ of index $y$ such that $x \le y R_K < 2x$; "No" otherwise.

1 Guess the polynomial sized compact representations of $m$ units of $O$, say $\{\varepsilon_i\}_{i=1}^{m}$;

Compute rational approximations of $\{\mathrm{Log}(\varepsilon_i)\}_{i=1}^{m}$ and check if they form a linearly independent set. If not reject;

Compute the volume of the parallelepiped formed by the $\{\mathrm{Log}(\varepsilon_i)\}_{i=1}^{m}$ and check if it lies in the interval $[x, 2x)$. If not reject;

2 Use the algorithm in Theorem 9 to compute a canonical fundamental system of units, say $\eta_i$'s;

Check whether the compact representations obtained in step 2 are the same as the guessed compact representations. If yes accept else reject;

We now explain Step 1. First guess $m$ (polynomial sized) compact representations of $m$ algebraic integers $\{\alpha_i\}_{i=1}^{m}$. Applying Theorem 4 it is possible to compute $\alpha_i O$ and check whether $\alpha_i O = O$ in polynomial time.

In the above NP machine, if $x$ is such that $x \le R_K < 2x$ then there will be only one accepting path. This is because any set of $m$ units that was guessed in step 1 and passes the volume check will indeed be a fundamental system of units. For each of these paths, step 2 will give a unique compact representation of a fundamental system of units, namely those units whose images under the Log map give the $\prec$-minimum basis of $\mathrm{Log}(O^*)$. Hence the only path that will accept is that which guessed that unique compact representation of units corresponding to the $\prec$-minimal basis of $\mathrm{Log}(O^*)$.

It is known that the regulator of any number field is at least 0.2 [8]. We now describe the procedure that computes the required bound $B$:

Input: A $\mathbb{Z}$-basis for the ring of integers, $O$, of a number field $K$
Output: A rational $B$ such that $B \le R_K < 2B$
$B := 0.2$;
while true do
    if $(B, O_K) \in A$ then return $B$;
    $B := 2B$;
end



Since this procedure makes UP-like queries to the NP language $A$ we can convert it into an FP$^{\mathrm{SPP}}$ algorithm by Lemma 1.
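The doubling search for $B$ is simple enough to sketch directly. In the code below the membership test for the language $A$ is replaced by a hypothetical stand-in oracle that knows a made-up regulator value; the names `find_bound` and `in_A` and the value of `R` are assumptions for illustration and do not come from the paper.

```python
from fractions import Fraction

def find_bound(in_A, start=Fraction(1, 5)):
    """Return the first B in start, 2*start, 4*start, ... accepted by in_A.

    `in_A(B)` stands in for the oracle query "(B, O_K) in A". Starting from
    the lower bound 0.2 on regulators, exactly one candidate is accepted.
    """
    B = start
    while not in_A(B):
        B *= 2
    return B

# Stand-in oracle for a hypothetical field with regulator R. For the values
# queried by find_bound only the index y = 1 can satisfy x <= y*R < 2x,
# so the test below agrees with membership in A on those queries.
R = Fraction(2887, 1000)
B = find_bound(lambda x: x <= R < 2 * x)
```

The number of queries is logarithmic in $R_K / 0.2$, so the loop runs for polynomially many iterations whenever $\lg R_K$ is polynomially bounded.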

Lemma 8. Given a constant $B$ such that $B \le R_K < 2B$, a fundamental system of units can be computed in FP$^{\mathrm{SPP}}$.

Proof. First, consider the following nondeterministic polynomial time machine 
M. The machine M first guesses a set of m algebraic integers in their compact 
representation and then verifies in polynomial time that the guessed algebraic 
integers indeed form a fundamental system of units by first checking whether 
they are indeed units (check if $\alpha O = O$) and then calculating the volume of
the parallelepiped (in the Log map) formed by the vectors corresponding to the 
guessed units, by a determinant computation. If this volume does not lie between 
B and 2B, the machine M rejects on this computation path. Otherwise, applying 
Theorem 9 along with the guessed fundamental system of units as input, the machine $M$ now computes a canonical fundamental system of units and checks if it coincides with the guessed fundamental system of units. If they do coincide the machine $M$ accepts along this computation path. It is clear from the above description that the nondeterministic machine $M$ has a unique accepting path. Applying Lemma 1, we can now design from $M$ an FP$^{\mathrm{SPP}}$ algorithm that will compute a fundamental system of units, if it is additionally given $B$ such that $B \le R_K < 2B$.

Now, combining Lemmas 7 and 8 we immediately obtain the following. 

Theorem 10. There is an FP$^{\mathrm{SPP}}$ algorithm to compute a fundamental system of units of the ring of integers of a number field $K$, assuming that the degree $[K : \mathbb{Q}]$ is a constant.

6 Principal Ideal Testing is in SPP 

Given a number field $K$ and a $\mathbb{Z}$-basis for its ring of integers $O$, the principal ideal testing problem (denoted by PrI) is to check if an ideal $\mathfrak{a}$ of $O$ is a principal ideal. We show that this problem is in SPP.

Theorem 11. Given a number field $K$ with ring of integers $O$ and the $\mathbb{Z}$-basis of an ideal $\mathfrak{a}$, checking whether $\mathfrak{a}$ is principal is in SPP, assuming $[K : \mathbb{Q}]$ is a constant.

Proof. Without loss of generality we can assume that $\mathfrak{a}$ is an integral ideal. First compute a fundamental system of units $\{\varepsilon_i\}_{i=1}^{m}$ of $O$ using the FP$^{\mathrm{SPP}}$ algorithm in Theorem 10. Guess the compact representation of an algebraic integer $\alpha$. Check if $\alpha O = \mathfrak{a}$; if not reject. Check if $\mathrm{Log}(\alpha)$ lies in the fundamental parallelepiped $\{x \in \mathbb{R}^m : x = \sum a_i \mathrm{Log}(\varepsilon_i),\ a_i \in [0, 1)\}$. If not reject. Next, apply Theorem 8 to obtain a compact representation of $\alpha'$ from $\mathrm{Log}(\alpha)$ and $\mathfrak{a}$, such that $\mathrm{Log}(\alpha) = \mathrm{Log}(\alpha')$. If the compact representations of $\alpha$ and $\alpha'$ coincide we accept on that computation path and reject otherwise. The correctness of the algorithm easily follows from Theorem 8 and the fact that for every $\alpha \in O$ there is a unique associate in the fundamental parallelepiped.



6.1 Computing the Class Number 

Finally, we show that if we assume the generalized Riemann hypothesis (GRH) then finding the class number and a presentation for the class group are in FP$^{\mathrm{SPP}}$. It is shown by Bach [2] that if the GRH is true then the class group of any number field $K$ is generated by the ideal classes of all non-inert prime ideals of norm less than $L = 12 \ln^2 |D|$. Let $\mathfrak{p}_1, \ldots, \mathfrak{p}_N$ be the (polynomially many) ideal classes of all non-inert prime ideals of norm less than $L = 12 \ln^2 |D|$. We can compute these ideals $\mathfrak{p}_i$ in polynomial time as explained, for example, in [8, Section 6.2.5].
Our goal is to compute the class number and a generator-relator presentation of the class group of $K$. Let $G_i$ denote the subgroup of $Cl(K)$ generated by $\{\mathfrak{p}_1, \ldots, \mathfrak{p}_i\}$ and let $G_0 = \{\mathrm{id}\}$. For each $i$ let $t_i$ be the least positive integer such that $\mathfrak{p}_i^{t_i} \in G_{i-1}$. Let $s_{ij}$, $1 \le i \le N$ and $1 \le j < i$, be integers such that $0 \le s_{ij} < t_j$ and such that $\mathfrak{p}_i^{t_i} \sim \prod_{j<i} \mathfrak{p}_j^{s_{ij}}$. The set of relators defined by

$$R = \left\{ \mathfrak{p}_i^{t_i} = \prod_{j=1}^{i-1} \mathfrak{p}_j^{s_{ij}} : 1 \le i \le N \right\}$$

together with the generator set $\{\mathfrak{p}_1, \ldots, \mathfrak{p}_N\}$ gives a generator-relator presentation of $Cl(K)$. Furthermore, notice that the $s_{ij}$'s are unique in the range $0 \le s_{ij} < t_j$. Also, $t_i$ is the order of the ideal class of $\mathfrak{p}_i$ modulo $G_{i-1}$. We will describe a procedure for computing the set $R$ inductively as follows.

Assume that the set of relators

$$R_i = \left\{ \mathfrak{p}_k^{t_k} = \prod_{j=1}^{k-1} \mathfrak{p}_j^{s_{kj}} : 1 \le k \le i \right\}$$

is already computed (where $R_0 = \emptyset$). It suffices to give an FP$^{\mathrm{SPP}}$ procedure for computing $R_{i+1}$. To this end, we define an NP$^{\mathrm{SPP}}$ language $A$ as follows: $A$ consists of the set of tuples $(x, y, \{\mathfrak{a}_j\}_{j=1}^{i+1}, \{(m_j, n_j)\}_{j=1}^{i})$ such that there is an $x \le t < y$ and $m_j \le s_j \le n_j$ such that $\mathfrak{a}_{i+1}^{t} \sim \prod_{j=1}^{i} \mathfrak{a}_j^{s_j}$, where the $\mathfrak{a}_j$'s are ideals in $O_K$ given by their HNFs. It is easy to see that the language $A$ is in NP$^{\mathrm{SPP}}$: guess $t$ and the $s_j$'s and verify the class group identity by applying the SPP algorithm for principal ideal testing in Theorem 11.

The following code is a polynomial-time oracle computation (with oracle A) 
that computes Ri+i from Ri. 

Let T := 1;
while true do
    if $(T, 2T, \{\mathfrak{p}_j\}_{j=1}^{i+1}, \{(1, t_j)\}_{j=1}^{i}) \in A$ then break else T := 2T
end

Do a binary search for $t_{i+1}$ in the range [T, 2T) using A as oracle;

Next, do a binary search to compute the $s_j$'s. (* These $s_j$'s are actually the $s_{i+1,j}$
in the definition of R. *)
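The doubling phase followed by a binary search can be sketched in isolation. The following Python fragment is only an illustration of that search pattern: `oracle(lo, hi)` is a hypothetical stand-in for a query to $A$ and is assumed to answer whether some admissible $t$ lies in the closed interval [lo, hi]; the toy oracle built by `make_oracle` (also hypothetical) accepts the positive multiples of a fixed order.

```python
def find_least(oracle):
    """Return the least positive t for which the interval oracle says yes.

    oracle(lo, hi) is assumed to report whether an admissible t exists
    with lo <= t <= hi. It stands in for a query to the language A.
    """
    T = 1
    while not oracle(T, 2 * T):      # doubling phase: find the right scale
        T *= 2
    lo, hi = T, 2 * T
    while lo < hi:                   # binary search inside [T, 2T]
        mid = (lo + hi) // 2
        if oracle(lo, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo


def make_oracle(order):
    """Toy oracle: admissible values are the positive multiples of `order`."""
    return lambda lo, hi: any(t % order == 0 for t in range(lo, hi + 1))
```

The number of oracle calls is logarithmic in the value found, which is what keeps the oracle computation polynomial in the input size.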

It is easy to see that in the above algorithm only UP-like queries are made to
$A$. More precisely, the queries will be of the form $(x, y, \{\mathfrak{p}_j\}_{j=1}^{i+1}, \{(m_j, n_j)\}_{j=1}^{i})$,
with parameters such that the $\mathrm{NP}^{\mathrm{SPP}}$ machine for $A$ will have at most one
accepting path. The queries are UP-like as $R_i$ is a set of relators for $G_i$. Now,
using closure properties of SPP and Lemma 1 we can transform the above
algorithm to an $\mathrm{FP}^{\mathrm{SPP}}$ procedure that inductively computes the generator-relator
presentation of the class group $\mathrm{Cl}(K)$. Observe that the class number is given
by $\prod_i t_i$. Hence we have the following theorem.

Theorem 12. Assuming the GRH, the class number and a generator-relator
presentation for the class group of a constant-degree number field can be computed
in $\mathrm{FP}^{\mathrm{SPP}}$.
Abstract. We provide explicit formulae for realising the group law in
Jacobians of superelliptic curves of genus 3 and $C_{3,4}$ curves. It is shown
that two distinct elements in the Jacobian of a $C_{3,4}$ curve can be added
with 150 multiplications and 2 inversions in the field of definition of the
curve, while an element can be doubled with 174 multiplications and
2 inversions. In superelliptic curves, 10 multiplications are saved.



1 Introduction 

The interest in the arithmetic of low genus algebraic curves has been spurred
by the fact that their Jacobians provide attractive groups to implement discrete
logarithm based cryptosystems. While the attention first focused on the simplest
types of curves, namely elliptic and hyperelliptic ones, it is shifting towards more
complicated ones in the form of superelliptic and more generally $C_{a,b}$ curves.
Attacks on high [5,6] and medium genus hyperelliptic cryptosystems [9,20] make
it seem advisable to concentrate on curves of genus at most 3, which leaves
superelliptic cubics and $C_{3,4}$ curves. There are a number of generic algorithms
for computing $\mathcal{L}$-spaces of arbitrary curves, thus implementing the arithmetic of
their Jacobian groups, like [13,15], to cite only the most recent ones. For
superelliptic and $C_{a,b}$ curves, faster special purpose algorithms have been developed,
relying either on Gröbner basis computation [1] or LLL on polynomials [8,12,3].
None of these articles provide a precise count of the number of operations for
the arithmetic of non-hyperelliptic curves of low genus.

In [2], the present authors identify special Jacobian elements, called “typical”,
that, while admitting a special simplified representation by two polynomials
(cf. Section 2), yet cover the major part of the Jacobian. (In fact, in the
cryptographic context of a large finite base field and randomly chosen elements, one
does not expect to ever encounter non-typical elements.) Then two algorithms
for adding typical elements are developed. The first algorithm is inspired by
Cantor’s algorithm for hyperelliptic curves. The second one is obtained by applying
the FGLM algorithm for changing the ordering of a Gröbner basis, and yields
explicit formulae for the group law in terms of operations with polynomials.
These two algorithms for superelliptic cubics generalise readily to $C_{3,4}$ curves.
A first quick implementation in the superelliptic case required about 250
multiplications in the underlying field for the algorithm à la Cantor, and about
200 multiplications for the formulae on the polynomial level, thus improving by
a factor of 3 on our implementation of the algorithm of [8].

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 87-101, 2004.
© Springer-Verlag Berlin Heidelberg 2004

A. Basiri et al.

Flon and Oyono use a slightly different approach in [7] to obtain explicit
formulae for the group law on typical elements.¹ The formulae they give for
superelliptic cubics require 156 resp. 174 field multiplications (depending on
whether two distinct elements are added or an element is doubled) and 2
inversions in the field. For $C_{3,4}$ curves, they announce 177 resp. 198 multiplications
and again 2 inversions, without providing further details.

In the present article we show how a carefully optimised implementation of
our formulae allows us to save up to 16 multiplications for superelliptic curves and
27 multiplications for $C_{3,4}$ curves compared to [7]. We explain in detail how we
obtained a straight line program, suited for a straightforward implementation in
constrained environments such as smartcards, from the formulae manipulating
polynomials, and for the first time we provide details for $C_{3,4}$ curves.

In the following section, we give a concise overview of superelliptic cubics and
$C_{3,4}$ curves and relate without proof the algorithm developed in [2]. After a few
remarks on the underlying field arithmetic in Section 3, we present in Section 4 
algorithms for computing with low degree polynomials. These algorithms serve 
as a toolbox for transforming the formulae on the polynomial level into formulae 
on the coefficient level, which we describe together with an exact operation count 
in Section 5. 



2 Jacobians of $C_{3,4}$ Curves

Let $K$ be a perfect field. A superelliptic curve of genus 3 or Picard curve over
$K$ is a smooth affine curve $C$ of the form $C : Y^3 = f(X)$ with $f \in K[X]$
monic of degree 4; a $C_{3,4}$ curve is more general and may additionally have a
term of the form $h(X)Y$ with $h \in K[X]$ of degree at most 2. The place at
infinity of the function field extension $K(C)/K(X)$ corresponding to such a
curve is totally ramified and rational over $K$, whence the $K$-rational part of the
Jacobian of $C$ is isomorphic as a group to the ideal class group of the coordinate
ring $K[C] = K[X,Y]/(Y^3 + hY - f)$. By the Riemann-Roch theorem, each ideal
class contains a unique ideal $\mathfrak{a}$ of minimal degree $\dim_K(K[C]/\mathfrak{a})$ not exceeding 3,
and this ideal is called the reduced representative of the class. In the following,
we shall only consider ideals not containing two distinct prime ideals above a
rational prime of $K[X]$. Such an ideal can be written as

$\mathfrak{a} = (u, Y - v)$ with $u, v \in K[X]$, $u$ monic, $\deg v < \deg u$ and $u \mid v^3 + hv - f$;

its degree is exactly $\deg u$.

¹ Technically speaking, they compute a minimum with respect to the $C_{3,4}$ order in
the ideal itself and end up in the inverse class, while we compute a minimum in the
inverse class and end up with a reduced representative of the ideal itself.

Most ideal classes (a proportion of $1 - O(1/q)$ in the case of a finite ground
field $\mathbb{F}_q$) are represented by such an ideal satisfying furthermore $\deg u = 3$ and
$\deg v = 2$; we call these ideals “typical”. Conversely, it is shown in [2] that
for superelliptic curves of genus 3 a typical ideal is automatically reduced, and
the proof carries over to $C_{3,4}$ curves. In the remainder of this article, we shall
only consider typical ideals and the cases where ideals and polynomials behave
in a typical way (for instance, the remainder of a polynomial of degree at least
$n + 1$ upon division by a polynomial of degree $n$ is supposed to be of degree $n - 1$).
Whenever these assumptions do not hold (which one does not expect to happen
in practice), one may have recourse to a slower generic algorithm. Alternatively,
one may develop specific formulae that actually turn out to be simpler than in
the common case.

The product of two ideal classes, represented by ideals $\mathfrak{a}_i = (u_i, Y - v_i)$,
$\deg u_i = 3$, $\deg v_i = 2$, $i \in \{1, 2\}$, is obtained in two steps. The composition
step corresponds to ideal multiplication and yields $\mathfrak{a} = (u, Y - v) = \mathfrak{a}_1 \mathfrak{a}_2$. Here,
$u = u_1 u_2$ is monic of degree 6, and $v$ of degree 5 is computed by interpolation.
In the addition case, where $u_1 \neq u_2$ (and typically $u_1$ and $u_2$ are coprime), $v$ is
obtained by Chinese remaindering as follows:

$s_1 = u_1^{-1} \bmod u_2; \quad t = s_1 (v_2 - v_1) \bmod u_2; \quad v = v_1 + t u_1.$ (1)

In the doubling case $\mathfrak{a}_1 = \mathfrak{a}_2$, a Hensel lift yields

$s_3 = (3 v_1^2 + h)^{-1} \bmod u_1; \quad w_1 = \frac{v_1^3 + h v_1 - f}{u_1}; \quad t = -s_3 w_1 \bmod u_1; \quad v = v_1 + t u_1.$ (2)
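The choice of $t$ in the doubling case can be checked by a short computation. The following derivation is a worked sketch using only the symbols of the doubling formulae; it expands the curve equation at $v_1 + t u_1$ and discards multiples of $u_1^2$.

```latex
\begin{align*}
(v_1 + t u_1)^3 + h\,(v_1 + t u_1) - f
  &\equiv (v_1^3 + h v_1 - f) + t u_1 (3 v_1^2 + h) \pmod{u_1^2} \\
  &= u_1 \bigl( w_1 + t\,(3 v_1^2 + h) \bigr).
\end{align*}
```

Hence $u_1^2$ divides $v^3 + hv - f$ exactly when $w_1 + t(3v_1^2 + h) \equiv 0 \pmod{u_1}$, that is, when $t \equiv -s_3 w_1 \pmod{u_1}$.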

The reduction step takes as input an ideal $\mathfrak{a} = (u, Y - v)$ with $u$ of degree 6
and $v$ of degree 5, and outputs an equivalent ideal $\mathfrak{a}' = (u', Y - v')$ of degree 3,
which by [2] is the reduced representative of its class. Let $e$ be the minimum
with respect to the $C_{3,4}$ order of the ideal $(u, Y^2 + vY + v^2 + h)$ in the class of
$\mathfrak{a}^{-1}$, that is, $e$ is the element whose pole at infinity has minimal multiplicity. It
is shown in [2] for superelliptic curves and easily generalised to $C_{3,4}$ curves that

$e = tY^2 + \varphi Y + \psi$,

where the polynomial $t$, of degree 2, is obtained by executing two steps of the
extended Euclidian algorithm on $u$ and $v$, and $\varphi = tv \bmod u$; otherwise said,
there is a linear polynomial $s$ such that $su + tv = \varphi$ of degree 3. Moreover,
$\psi = t(v^2 + h) \bmod u$. Then the reduced ideal $\mathfrak{a}' = \frac{e}{u}\mathfrak{a} = (u', Y - v')$ is computed
as follows:

$\lambda := \frac{t^2 f - \varphi\psi}{u}$

$\mu := \frac{\varphi^2 - t\psi + t^2 h}{u}$

$u' = \frac{f\,(f t^3 - 3 t\varphi\psi + \varphi^3) + \psi^3 + h\,(t\psi\,(th - 2\psi) + \varphi\,(t^2 f + \varphi\psi))}{u^2}$

$v' = \mu^{-1}\lambda \bmod u'$ (3)

Here, all divisions by $u$ are exact, that is with remainder zero.



3 Field Arithmetic 

In the previous section, we have shown how to realise the arithmetic of $C_{3,4}$
curves using a representation by polynomials. The main goal of this article is
to adopt a lower level point of view and to provide formulae for the coefficients 
of the output polynomials in terms of those of the input polynomials. Thus, 
the operations we consider as elementary are operations in the field defining the 
curve. Our main motivation being potential cryptographic applications, we have 
(not too small) finite fields in mind, but the final formulae will hold for any field. 
However, for certain optimisations to work, we exclude fields of characteristic 2, 
3 and 5. (As a side note, a superelliptic cubic is singular in characteristic 3, while 
in characteristic 2 , it has a special structure, which might make it less attractive 
for cryptography, cf. [10].) There is only one division by 5 in our formulae; if 
need be, it could be removed at the expense of a few extra multiplications. 

There are a thousand and one ways of organising the computations, so an 
optimality criterion is needed. Naturally, this should be the running time of a 
group operation in the Jacobian, which will ultimately depend on the concrete 
implementation and the concrete environment. A reasonable and theoretically 
tractable approximation is the number of elementary field operations. In many 
situations (over not too small finite fields, for instance, but not over the rational 
numbers) additions and subtractions take a negligible time compared to mul- 
tiplications and inversions (divisions being realised as an inversion followed by 
a multiplication). Notice that multiplications by small natural numbers can be 
realised by a few additions, so they come for free. 

Moreover, we do not count divisions by small constants (precisely, 2, 3 and
5) either. For instance, in a prime finite field $\mathbb{F}_p$, represented by $\{0, \ldots, p-1\}$,
division of an element $a$ by 2 is trivial: either $a$ is even, then it may be divided
by 2 as an integer; or it is odd, then $a + p$ is even and $(a + p)/2$ is a representative
of the result in $\{0, \ldots, p-1\}$. Slightly less straightforward, a division of $a$ by 3
may be realised by first computing the remainder of $a$ upon division by 3. Since
$4 \equiv 1 \pmod 3$, this is a matter of splitting $a$ in blocks of two bits using bit
masks and shifts and adding these base 4 digits, much as the test for divisibility
by 9 of a number in decimal notation. Then, one adds the appropriate multiple
of $p$ and divides by 3. For the remainder modulo 5, an alternating sum of base 4
digits may be used. In a general finite field, represented as a vector space over
$\mathbb{F}_p$, divisions by small constants can be carried out coordinatewise.
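The divisions by 2 and 3 described above can be illustrated in a few lines. This is a minimal Python sketch under the stated representation of $\mathbb{F}_p$ by $\{0, \ldots, p-1\}$; the function names `mod3`, `half` and `third` are chosen here for illustration and do not come from the article.

```python
def mod3(a):
    """Remainder of a non-negative integer modulo 3 via base-4 digit sums."""
    while a > 3:
        s = 0
        while a:
            s += a & 3      # lowest base-4 digit (4 = 1 mod 3)
            a >>= 2
        a = s
    return 0 if a == 3 else a


def half(a, p):
    """a/2 in F_p for an odd prime p and 0 <= a < p, without inversion."""
    return a // 2 if a % 2 == 0 else (a + p) // 2


def third(a, p):
    """a/3 in F_p for a prime p > 3 and 0 <= a < p, without inversion."""
    # Pick k in {0, 1, 2} with a + k*p divisible by 3.
    # Modulo 3, p is its own inverse, so k = -a * p (mod 3).
    k = ((3 - mod3(a)) * mod3(p)) % 3
    return (a + k * p) // 3
```

Both results stay in $\{0, \ldots, p-1\}$, since at most $2p$ is added before dividing by 3 and at most $p$ before dividing by 2.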

Finally, this leaves us with two variables to minimise, the number of multipli- 
cations and the number of inversions, and there is a trade off between these two. 
Depending on the library for finite field arithmetic, an inversion usually takes 
between three and ten times as long as a multiplication. We therefore tried to 
eliminate as many inversions as possible, as long as this introduced only a few 
extra multiplications. For instance, two independent inversions of field elements 
a and b may be replaced by one inversion and three multiplications as follows: 

$u = (a \cdot b)^{-1}, \quad a^{-1} = u \cdot b, \quad b^{-1} = u \cdot a.$
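As an illustration of this trade-off, the following Python sketch inverts two elements of a prime field with one inversion and three multiplications. The function name `invert_pair` is hypothetical, and the inversion itself is delegated to Python's built-in `pow` with exponent -1, which assumes Python 3.8 or later.

```python
def invert_pair(a, b, p):
    """Return (a^-1, b^-1) in F_p using a single field inversion."""
    u = pow(a * b % p, -1, p)        # multiplication 1, then the only inversion
    return u * b % p, u * a % p      # multiplications 2 and 3
```

The same idea extends to more than two elements, at the price of further multiplications per additional element.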

4 Polynomial Arithmetic 

Once the algorithm of Section 2 has been exhibited, the remaining task is no
longer connected to geometry, but rather to the topic of symbolic computation. The
rational formulae given there are expressed in terms of operations with polynomials;
it remains to phrase them in terms of their coefficients. A straightforward imple- 
mentation of the polynomial arithmetic involved is of course trivial; the problem 
of minimising the number of field operations, however, appears to be hard. The 
only feasible approach we have found consists of performing local optimisations 
on pieces of the formulae. 

In this section, we review different approaches and algorithms of polynomial 
arithmetic useful for this task. All of them are well-known in the symbolic compu- 
tation community; however, textbooks often focus on the asymptotic behaviour 
of the algorithms and do not treat the very small instances we are interested in. 
While commenting on our choices for the concrete case of $C_{3,4}$ curves, we hope
that the following overview will be helpful in further situations.

4.1 Multiplication 

Let $M(m,n)$ denote the number of field multiplications carried out for
multiplying two polynomials with $m$ resp. $n$ coefficients, that is, of degree $m-1$ and
$n-1$; and let $M(n) := M(n,n)$. If useful, we indicate the employed algorithm
by a subscript. For instance, the “naive” method has $M(m,n) = mn$. A trivial
improvement arises when one or both polynomials are monic; in the latter case,
the equation

$(X^{m-1} + f(X))(X^{n-1} + g(X)) = X^{m+n-2} + f(X)X^{n-1} + g(X)X^{m-1} + (fg)(X)$

shows that at the expense of a few additions, only $(m-1)(n-1)$ multiplications
are needed.

A substantial improvement is obtained by Karatsuba’s multiplication [16].
Using the relation $(aX + b)(cX + d) = a \cdot c\,X^2 + ((a + b) \cdot (c + d) - ac - bd)\,X + b \cdot d$,
it achieves $M_K(2) = 3$, and, by recursively splitting the polynomials in half,

$M_K(2^m, 2^n) = 2^{m-n} 3^n$ for $m \ge n$.

Generalisations by Toom [21] and Cook [4] to the product of two polynomials
with three coefficients yield $M_{TC}(3) = 5$, and the analogous recursive strategy
may be applied.
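The Karatsuba relation for two linear polynomials translates directly into code. The following Python sketch is illustrative only; the function name `karatsuba2` is an assumption, and plain integers stand in for field elements.

```python
def karatsuba2(a, b, c, d):
    """Coefficients of (a*X + b)*(c*X + d), highest degree first.

    Uses three products instead of four, at the cost of extra additions.
    """
    hi = a * c                               # product 1
    lo = b * d                               # product 2
    mid = (a + b) * (c + d) - hi - lo        # product 3
    return hi, mid, lo
```

Applying the same splitting recursively to the halves of longer polynomials gives the count $M_K(2^n) = 3^n$ stated above.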

4.2 Exact Division 

The simplest way of computing the quotient and remainder of $v = v_n X^n +
v_{n-1} X^{n-1} + \cdots$ divided by $u = u_m X^m + u_{m-1} X^{m-1} + \cdots$ is the schoolbook
method: invert $u_m$; the first term of the quotient, $u_m^{-1} v_n X^{n-m}$, is then computed
with one multiplication; multiply it back by $u$, subtract from $v$; and continue in
the same way.

If the remainder is of no interest or known to be zero, then it is not necessary
to multiply back by all of $u$. In fact, it suffices to consider only the leading
terms of $u$ and $v$, while the lower ones determine the remainder. The $k$ leading
coefficients of the quotient are obtained with one inversion and

$1 + \cdots + k$ (4)

multiplications. If moreover $u$ is monic, then the inversion and the multiplications
by $u_m^{-1}$ are saved, resulting in

$1 + \cdots + (k-1)$ (5)

multiplications. Letting $k = n - m + 1$ yields the full quotient. As observed
by Jebelean [14] in the case of integer division, if the division is exact, that is,
the remainder is zero, then it is also possible to work with only the trailing
coefficients of $u$ and $v$ from right to left; in fact, his algorithm amounts to the
division of the reciprocal polynomials of $u$ and $v$. The number of operations is
unaffected, but the algorithm deploys its benefits when used to work from both
sides simultaneously as suggested by Schönhage in [19], using the $k$ lowest and
$n - m + 1 - k$ highest coefficients for some $k$. For given $u$ and $v$, the value
$k = \lfloor (n - m + 1)/2 \rfloor$ is optimal. If the effort for computing $u$ and $v$ is to be taken into
account, other choices may be preferable; for instance, we shall use $k = 1$ most
of the time.
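The count (5) for a monic divisor can be made concrete. The following Python sketch computes the quotient of an exact division from the leading coefficients only and counts the multiplications it performs; the function name and the coefficient-list convention (lowest degree first) are assumptions made for the illustration, and it works from the top only, not from both sides.

```python
def exact_quotient_monic(v, u):
    """Quotient of v by a monic u, reading only leading coefficients.

    v and u are coefficient lists, lowest degree first. The remainder is
    never formed. Returns (quotient, number_of_multiplications).
    """
    n, m = len(v) - 1, len(u) - 1
    assert u[m] == 1, "divisor must be monic"
    k = n - m + 1                      # number of quotient coefficients
    q = [0] * k
    mults = 0
    for j in range(k - 1, -1, -1):     # from the leading coefficient down
        acc = v[j + m]
        # subtract contributions of the quotient terms already known
        for i in range(j + 1, min(k, j + m + 1)):
            acc -= q[i] * u[m - (i - j)]
            mults += 1
        q[j] = acc
    return q, mults
```

For a quotient with four coefficients and a cubic divisor this performs 1 + 2 + 3 = 6 multiplications, in line with (5) for $k = 4$.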

4.3 Short Product 

As seen in Section 4.2 on exact divisions, one is sometimes interested in only the
trailing (or leading) coefficients of a polynomial. If this polynomial is the result
of a multiplication, then these coefficients are obtained by what is known as a
“short product”. In the case of trailing coefficients, it can be seen as the product
of truncated power series, returning upon input of two polynomials $u$ and $v$ of
degree $n-1$ their product modulo $X^n$. Instead of computing the full product
of $u$ and $v$ and then truncating, Mulders suggested the following algorithm [17]:
choose a cutoff point $k \ge n/2$, and write

$u = u_0 + u_1 X^k$ and $v = v_0 + v_1 X^k$.

Then $uv \bmod X^n$ is computed recursively as

$u_0 v_0 + ((u_0 v_1 \bmod X^{n-k}) + (u_1 v_0 \bmod X^{n-k}))\,X^k$

by a full and two short products. If $S(n)$ denotes the number of field
multiplications required for computing a short product modulo $X^n$, we obtain the
recursive formula

$S(n) = M(k) + 2S(n-k)$.

Hanrot and Zimmermann [11] showed that if the full product is computed by
Karatsuba’s algorithm, then the optimal cutoff point $k$ is the largest power
of 2 not exceeding $n$. For instance, a short product modulo $X^3$ is computed
by a full Karatsuba product of order 2 and two further field multiplications,
resulting in altogether $S(3) = 5$ multiplications. This is the same number as
for the full product by the Toom-Cook approach, but fewer additions and no
division by 3 are required. It is to be expected that with Toom-Cook as the basic
multiplication method, the optimal cutoff point will be once or twice a power of
3; then $S(4)$ becomes $S(4) = M_{TC}(3) + 2S(1) = 7$. To generalise to products of
polynomials of different degrees, let for $d \ge m \ge n$ the value $S(m,n;d)$ denote
the number of multiplications required to compute the product modulo $X^d$ of
a polynomial of degree $m-1$ with one of degree $n-1$. Then for a cutoff point
$k \le n$ we obtain

$S(m,n;d) = M(k) + S(m-k,k;d-k) + S(n-k,k;d-k)$.

For instance, $S(4,3;4)$ with Toom-Cook and $k = 3$ becomes

$S(4,3;4) = M_{TC}(3) + S(1,3;1) + S(0,3;1) = 5 + 1 + 0 = 6$.
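The case $S(3) = 5$ can be written out explicitly. The following Python sketch computes a product modulo $X^3$ with cutoff $k = 2$: one Karatsuba product of the low halves and two one-term short products. The function name and the lowest-degree-first list convention are assumptions for the illustration.

```python
def short_product3(u, v):
    """u*v mod X^3 for 3-coefficient lists (lowest degree first), 5 products."""
    u0, u1, u2 = u
    v0, v1, v2 = v
    # Full Karatsuba product of the low parts u0 + u1*X and v0 + v1*X.
    lo = u0 * v0                               # product 1
    hi = u1 * v1                               # product 2
    mid = (u0 + u1) * (v0 + v1) - lo - hi      # product 3
    # Two short products modulo X^1, both contributing to X^2.
    cross = u0 * v2 + u2 * v0                  # products 4 and 5
    return [lo, mid, hi + cross]
```

A schoolbook computation of the same three coefficients would need 6 products, so one multiplication is saved.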



4.4 Interpolation 

Karatsuba and Toom-Cook multiplication essentially work by evaluating the
factors at small arguments (say, 0, 1, $-1$, 2, ...), multiplying the values and
interpolating the result. Additionally, they treat the values at “$\infty$”, that is
the leading coefficients, separately. Of course, this approach can be extended
to higher degree polynomials as long as the base field has a sufficiently large
characteristic so that the interpolation points are different. Evaluating a (low
degree) polynomial in small integers requires only additions and multiplications
by (small) constants; interpolation also uses divisions by (small) integers. So in
our model, these steps cost nothing. One thus obtains a complexity of

$M(m, n) = m + n - 1$.

But the interpolation approach is not limited to simple multiplications; it can
be extended to arbitrary polynomial and even rational formulae as long as the
result is a polynomial, that is, all divisions are exact. For instance, the
polynomial $u'$ as computed in (3) is monic of degree 3, so it can be reconstructed by
computing the values of $u' - X^3$ in 0, 1 and $-1$. Each value requires as many
field multiplications as there are polynomial multiplications in the numerator of
the formula, after adding parentheses and suitably reusing common
subexpressions, and additionally a squaring and an inversion of the corresponding values
of $u$ in the denominator and a multiplication by the inverses. As mentioned in
Section 3, the different inversions may be pooled into only one if one is willing
to spend more time with multiplications.
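Multiplication by evaluation and interpolation can be sketched as follows. This Python fragment is an illustration over the rationals (via `fractions.Fraction`) rather than over a finite field, uses finite evaluation points only (no point at infinity), and its function names are assumptions. Only the pointwise products in `multiply_by_interpolation` would count as field multiplications in the model of the text.

```python
from fractions import Fraction


def evaluate(poly, x):
    """Horner evaluation; coefficients are given lowest degree first."""
    acc = 0
    for c in reversed(poly):
        acc = acc * x + c
    return acc


def interpolate(points, values):
    """Lagrange interpolation; returns coefficients, lowest degree first."""
    n = len(points)
    result = [Fraction(0)] * n
    for i in range(n):
        basis = [Fraction(1)]              # product of (X - x_j) for j != i
        denom = Fraction(1)
        for j in range(n):
            if j == i:
                continue
            basis = [Fraction(0)] + basis  # multiply by X ...
            for k in range(len(basis) - 1):
                basis[k] -= points[j] * basis[k + 1]   # ... and subtract x_j * old
            denom *= points[i] - points[j]
        for k in range(n):
            result[k] += values[i] * basis[k] / denom
    return result


def multiply_by_interpolation(f, g):
    """Product of f and g using len(f) + len(g) - 1 pointwise products."""
    size = len(f) + len(g) - 1
    points = [0, 1, -1, 2, -2, 3, -3]
    assert size <= len(points), "sketch handles small polynomials only"
    points = points[:size]
    values = [evaluate(f, x) * evaluate(g, x) for x in points]
    return [int(c) for c in interpolate(points, values)]
```

For two polynomials with three coefficients each, five pointwise products are used, matching $M(3,3) = 5$.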

4.5 Extended Euclidian Algorithm 

The classical extended Euclidian algorithm, upon input of two polynomials $r_{-1}$
and $r_0$ with $\deg r_{-1} > \deg r_0$, computes the greatest common divisor $d$ of $r_{-1}$
and $r_0$ together with multipliers $a$ and $b$ such that

$d = a r_{-1} + b r_0$.

It proceeds by iterated divisions with remainder, until the remainder vanishes,
and thus requires a certain number of inversions. The greatest common divisor
is only defined up to multiplication by constants, and it is possible to modify the
algorithm to use pseudodivisions without field inversions (this is essentially the
subresultant algorithm). Keeping track of the multipliers $a$ and $b$ then requires
extra multiplications. Assume by induction that remainders $r_{i-1}$ and $r_i$ and
multipliers $a_{i-1}$, $a_i$, $b_{i-1}$ and $b_i$ are given such that

$r_{i-1} = a_{i-1} r_{-1} + b_{i-1} r_0$ and $r_i = a_i r_{-1} + b_i r_0$;

the initial values being $a_{-1} = b_0 = 1$, $a_0 = b_{-1} = 0$.

Let $\ell$ be the function that to a polynomial associates its leading coefficient.
Then the pseudodivision of $r_{i-1}$ by $r_i$ yields a quotient $q_{i+1}$ and a remainder
$r_{i+1}$ such that

$\ell(r_i)^{\deg r_{i-1} - \deg r_i + 1}\, r_{i-1} = q_{i+1} r_i + r_{i+1}$,

and $a_{i+1} = \ell(r_i)^{\deg r_{i-1} - \deg r_i + 1}\, a_{i-1} - q_{i+1} a_i$ and
$b_{i+1} = \ell(r_i)^{\deg r_{i-1} - \deg r_i + 1}\, b_{i-1} - q_{i+1} b_i$ satisfy $r_{i+1} = a_{i+1} r_{-1} + b_{i+1} r_0$.

We analyse the most common case in more detail, where $\deg r_0 = \deg r_{-1} - 1$
and the remainder degrees drop by 1 in each step, that is, all the $q_i$ have
degree 1. Letting $r_{i-1} = \alpha X^k + \beta X^{k-1} + \cdots$ and $r_i = \gamma X^{k-1} + \delta X^{k-2} + \cdots$,
the next quotient is computed as $q_{i+1} = \gamma \cdot \alpha X + (\gamma \cdot \beta - \delta \cdot \alpha)$. This requires
3 multiplications in the general case, 1 multiplication if $r_{i-1}$ or $r_i$ is monic, and
comes for free if both of them are monic.

The next remainder is obtained as $r_{i+1} = \ell(r_i)^2 \cdot r_{i-1} - q_{i+1} \cdot r_i$, using
interpolation with $1 + 2 \deg r_i$ multiplications (or just $\deg r_i$ if $r_i$ is monic).

Now, $\ell(r_i)^2$ is known, and computing $a_{i+1}$ is free for $i = 0$, and requires
1 multiplication for $i = 1$ and $M(1, \deg a_{i-1} + 1) + M(2, \deg a_i + 1) = i - 1 + M(2, i)$
multiplications for $i \ge 2$. If furthermore $r_0$ is monic, then $a_2 = -q_1$ comes also
for free, and $a_3 = \ell(r_2)^2 + q_1 \cdot q_2$ requires only $M(2) = 3$ multiplications.
Obtaining $b_{i+1}$ is free for $i = 0$ and requires $M(2) = 3$ multiplications for
$i = 1$ and $M(1, \deg b_{i-1} + 1) + M(2, \deg b_i + 1) = i + M(2, i + 1)$ multiplications
for $i \ge 2$.

The following table summarises the number of multiplications carried out to
compute the greatest common divisor, $a$ and $b$, depending on the degree $n$ of
$r_0$, that is, the number of division steps, and on the monicity of $r_{-1}$ and $r_0$. It
is assumed that polynomials are multiplied using interpolation as explained in
Section 4.4.



      |    generic     |  r_{-1} monic  |   r_0 monic
   n  |  gcd   a    b  |  gcd   a    b  |  gcd   a    b
   1  |    6   0    0  |    4   0    0  |    2   0    0
   2  |   14   1    3  |   12   1    3  |    7   0    3
   3  |   24   5    9  |   22   5    9  |   16   3    9
   4  |   36  11   17  |   34  11   17  |   27   9   17



4.6 Modular Division by Linear Algebra 

One ingredient in our formulae for superelliptic arithmetic is the computation
of $v' = \mu^{-1}\lambda \pmod u$, where $\mu$ is of degree 1, $\lambda$ of degree 2, and $u$ monic of
degree 3. This computation can be carried out by first determining $\mu^{-1} \bmod u$ by
the extended Euclidian algorithm as described in Section 4.5 and division by the
greatest common divisor, a constant in the base field; then multiplying by $\lambda$ and
finally reducing modulo $u$. In our implementation, these steps require 22
multiplications and one inversion. In this section, we describe a different approach,
saving 2 multiplications. The problem to be solved is much less generic than
those of the previous sections; the proposed solution, relying on linear algebra,
is quite general, however, and may also be applied to different constellations of
degrees.

Write $\mu = \mu_1 X + \mu_0$, $\lambda = \lambda_2 X^2 + \lambda_1 X + \lambda_0$ and $v' = x_2 X^2 + x_1 X + x_0$ with
unknown $x_2$, $x_1$, $x_0$. Then, by degree considerations, there is a further unknown
value $\gamma$ such that $\mu v' + \gamma u = \lambda$. Comparing coefficients, we obtain a system of
four linear equations in four variables, in which the equation $\gamma = -\mu_1 x_2$ can be
substituted immediately. Performing Gaussian elimination yields the solution

$x_2 = \alpha^{-1}\beta$

$x_0 = \mu_0^{-1}(\lambda_0 + \mu_1 u_0 x_2)$

$x_1 = \mu_0^{-1}(\lambda_1 + \mu_1 (u_1 x_2 - x_0))$

with

$\alpha = \mu_0^2 (\mu_0 - \mu_1 u_2) + \mu_1^2 (\mu_0 u_1 - \mu_1 u_0)$

$\beta = -\lambda_1 (\mu_0 \mu_1) + \lambda_0 \mu_1^2 + \lambda_2 \mu_0^2$.

$\alpha$ and $\beta$ are computed with 11 multiplications. Then $\alpha$ and $\mu_0$ are inverted
simultaneously with 3 multiplications and one inversion as described in Section 3.
The computation of $x_2$, $x_0$ and $x_1$ then requires $1 + 2 + 3 = 6$ multiplications
(reusing the expression $\mu_1 u_0$ needed for $\alpha$), for a total of 20 multiplications and
one inversion.
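The linear algebra approach can be checked numerically. The Python sketch below follows the solution obtained by comparing coefficients in $\mu v' + \gamma u = \lambda$ with $\gamma = -\mu_1 x_2$; the function name, the tuple conventions (constant coefficient first) and the expanded expression used for $\alpha$ are assumptions of this illustration, and the single inversion relies on `pow` with exponent -1 (Python 3.8 or later).

```python
def modular_division(mu, lam, u, p):
    """Return (x0, x1, x2) with mu * v' = lam (mod u) over F_p.

    mu = (mu0, mu1), lam = (lam0, lam1, lam2), and u = (u0, u1, u2) holds
    the lower coefficients of the monic cubic X^3 + u2*X^2 + u1*X + u0.
    """
    mu0, mu1 = mu
    lam0, lam1, lam2 = lam
    u0, u1, u2 = u
    # Determinant of the linear system and matching right-hand side.
    alpha = (mu0**3 - mu0**2 * mu1 * u2 + mu0 * mu1**2 * u1 - mu1**3 * u0) % p
    beta = (-lam1 * mu0 * mu1 + lam0 * mu1**2 + lam2 * mu0**2) % p
    # One inversion shared between alpha and mu0.
    w = pow(alpha * mu0 % p, -1, p)
    alpha_inv, mu0_inv = w * mu0 % p, w * alpha % p
    x2 = alpha_inv * beta % p
    x0 = mu0_inv * (lam0 + mu1 * u0 * x2) % p
    x1 = mu0_inv * (lam1 + mu1 * (u1 * x2 - x0)) % p
    return x0, x1, x2
```

The sketch does not try to reach the multiplication count of the text; it only confirms that the closed formulae solve the congruence whenever $\alpha$ and $\mu_0$ are invertible.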
5 Explicit Formulae

In this section, we develop explicit formulae for the composition and reduction
steps, counting precisely the number of field multiplications and inversions. To
keep the presentation readable, we use the building blocks of polynomial
arithmetic of Section 4; expanding these to formulae involving only field operations
is a trivial task.

We use two little tricks to speed up the computations. First, it is possible to
complete the fourth power in $f$; after the linear change of variables $X \leftarrow X - f_3/4$
we may assume that $f = X^4 + f_2 X^2 + f_1 X + f_0$. The impact of this observation
is not very high, since $f$ is hardly used in the formulae (it is mainly implicitly
present in the relations $v^3 + vh \equiv f \pmod u$).

Second, the composition step involves the extended Euclidian algorithm, and 
the resulting greatest common divisor is normalised to be 1. This normalisation 
step requires an inversion, which can be saved by modifying the output of the 
composition to be $(u, v, d)$ with $d$ in the base field such that the real ideal product
is given by $(u, Y - d^{-1}v)$. The composition step being of little interest per se,
there is no harm done as long as this modification is taken into account in the 
reduction process. 

5.1 Composition Addition 

Theorem 1. On input of two ideals $\mathfrak{a}_1 = (u_1, Y - v_1)$ and $\mathfrak{a}_2 = (u_2, Y - v_2)$
such that $u_i \mid v_i^3 + v_i h - f$ and $\gcd(u_1, u_2) = 1$, and assuming the typical behaviour
of the remainder degrees during the Euclidian algorithm, polynomials $u$ and $v$
and a field element $d$ such that $\mathfrak{a}_1 \mathfrak{a}_2 = (u, Y - d^{-1}v)$ can be computed with
37 multiplications.

Proof. We first compute $u = u_1 u_2$ by Toom-Cook multiplication with 5 field
multiplications. Then, implementing (1), we determine $s_1$ of degree 2 and $d$ such
that $s_1 u_1 \equiv d \pmod{u_2}$ by applying the extended Euclidian algorithm to $u_2$
and $u_1 - u_2$. According to the table in Section 4.5, with $n = 2$ and $r_{-1}$ monic
this needs $12 + 3 = 15$ multiplications. We then compute $t_1 = s_1 (v_2 - v_1)$ with
5 multiplications and the quotient $q$ of the result by $u_2$ with 1 multiplication
according to (5). The polynomial $t = t_1 - q \cdot u_2$ of degree 2 is then obtained
by interpolation on three points with 3 multiplications for the values of $q \cdot u_2$.
Finally, $v = d \cdot v_1 + u_1 \cdot t$ is computed with $M(1, 3) + M(3) = 8$ multiplications.

5.2 Composition Doubling 

With the preparations of Section 4, the composition part of doubling is as
straightforward as addition, but it requires noticeably more operations.

Theorem 2. On input of an ideal $\mathfrak{a}_1 = (u_1, Y - v_1)$ such that $u_1 \mid v_1^3 + v_1 h - f$,
and assuming the typical behaviour of the remainder degrees during the Euclidian
algorithm, polynomials $u$ and $v$ and a field element $d$ such that $\mathfrak{a}_1^2 = (u, Y - d^{-1}v)$
can be computed with 61 multiplications.

Proof. We use the notation introduced in (2). First, we compute $u = u_1^2$ and
$v_1^2$ with 5 multiplications each. To obtain $w_1$, we start with the four highest
coefficients of $v_1^3 + v_1 h - f = v_1 \cdot (v_1^2 + h) - f$ using $S(4, 3; 4) = 6$ multiplications
(see Section 4.3). An exact division by $u_1$ yields $w_1$ with 6 multiplications as
shown in (5). The next step is to determine $s_3$ of degree 2 and $d$ in the base field
such that $s_3 \cdot (3v_1^2 + h) \equiv d \pmod{u_1}$. By the table in Section 4.5, this requires
19 multiplications since $n = 3$ and $r_{-1}$ is monic.

We then reduce $w_1$, which is of degree 3, modulo the monic $u_1$ by subtracting
the appropriate multiple of $u_1$, obtained with 3 multiplications; multiply by
$s_3$ with $M(3) = 5$ multiplications and reduce again modulo $u_1$, which takes
1 multiplication for the quotient and 3 multiplications for the remainder, yielding
$t$ in a total of 12 multiplications.

Finally, $v$ is obtained by multiplying $v_1$ by $d$ and $t$ by $u_1$ with $M(3, 1) + M(3) = 8$
multiplications.

Notice that for adding as well as for doubling, the composition step is not
more costly on $C_{3,4}$ than on superelliptic curves.



5.3 Reduction 

Theorem 3. On input of an ideal $\mathfrak{a} = (u, Y - d^{-1}v)$ such that
$u \mid (d^{-1}v)^3 + d^{-1}vh - f$, $u$ monic of degree 6, $v$ of degree 5, the reduced representative
$\mathfrak{a}' = (u', Y - v')$ in the ideal class of $\mathfrak{a}$ can be computed with 113 multiplications and
2 inversions. In the superelliptic case, 10 multiplications may be saved.

Proof. To facilitate keeping track of the total number of multiplications, from
time to time we provide their balance, having the number for the superelliptic
case precede that for the general $C_{3,4}$ case.

0/0 



We use the notation of Section 2. Let furthermore $\bar{v} = d^{-1}v$, and denote the
coefficient of a polynomial in front of $X^i$ by a subscript $i$. The first step of the
algorithm consists of finding the minimum $e = tY^2 + \varphi Y + \psi$ of $\mathfrak{a}$ with respect
to the $C_{3,4}$ order, where the polynomial $t$, of degree 2, is obtained by executing
two steps of the extended Euclidian algorithm on $u$ and $\bar{v}$, $\varphi = t\bar{v} \bmod u$ is of
degree 3 and $\psi = t\bar{v}^2 + th \bmod u$ is of degree 5. Notice that $e$ is defined only up
to multiplication by constants; we shall compute the representative with leading
coefficient 1 for the $C_{3,4}$ order, that is, with $\psi$ monic. Inspection of (3) shows
that then also $u'$ will be monic, and no further normalisation will be needed.

To avoid inverting $d$, we shall use $v$ in the place of $\bar{v}$, and correct the
polynomials later on. Thus, we determine $\tilde{t}$ of degree 2, $\tilde{\varphi}$ of degree 3 and a linear
polynomial $\zeta$ such that $\tilde{\varphi} = \tilde{t} v \bmod u = \tilde{t} \cdot v - \zeta \cdot u$. Using the algorithm of
Section 4.5, the computation of $\tilde{t}$ and $\zeta$ requires 17 multiplications. By carrying out
the Euclidian algorithm symbolically and simplifying the resulting formulae by
hand, one may save 2 squarings as follows. The $\delta_i$ designate temporary variables.

$\delta_1 = u_4 \cdot v_5 - v_3$
$\delta_2 = u_5 \cdot v_5 - v_4$
$\delta_3 = \delta_1 \cdot v_5 - \delta_2 \cdot v_4$
$\tilde{t}_2 = \delta_3 \cdot v_5$
$\zeta_1 = \tilde{t}_2 \cdot v_5$
$\delta_4 = \delta_3 \cdot u_5$
$\delta_5 = \delta_2 \cdot v_3 + \delta_4 - v_5 \cdot (v_5 \cdot u_3 - v_2)$
$\tilde{t}_1 = \delta_5 \cdot v_5$
$\zeta_0 = v_5 \cdot (\tilde{t}_1 - \delta_2 \cdot \delta_3)$
$\tilde{t}_0 = (\delta_5 - \delta_4) \cdot \delta_2 + \delta_1 \cdot \delta_3$

15/15
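The straight line program for the two Euclidian steps lends itself to a direct check. The following Python sketch assumes that the temporaries are $\delta_1 = u_4 v_5 - v_3$, $\delta_2 = u_5 v_5 - v_4$ and so on as in the listing above, with plain integers standing in for field elements; the function name and the list conventions are assumptions of this illustration. It returns the degree 2 cofactor and the linear cofactor, here called `t` and `zeta`, for which `t*v - zeta*u` has degree at most 3.

```python
def two_euclid_steps(u, v):
    """Cofactors after two pseudodivision steps on u (monic sextic) and v.

    u = [u0, ..., u5] are the non-leading coefficients of the monic sextic,
    v = [v0, ..., v5] the coefficients of the quintic. Returns (t, zeta) as
    coefficient lists, lowest degree first. Uses 15 multiplications.
    """
    d1 = u[4] * v[5] - v[3]
    d2 = u[5] * v[5] - v[4]
    d3 = d1 * v[5] - d2 * v[4]
    t2 = d3 * v[5]
    z1 = t2 * v[5]
    d4 = d3 * u[5]
    d5 = d2 * v[3] + d4 - v[5] * (v[5] * u[3] - v[2])
    t1 = d5 * v[5]
    z0 = v[5] * (t1 - d2 * d3)
    t0 = (d5 - d4) * d2 + d1 * d3
    return [t0, t1, t2], [z0, z1]
```

Counting the products in the function body gives 15, in agreement with the balance 15/15.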



The polynomial $\tilde{\varphi}$ is obtained via interpolation from $\tilde{t}$, $v$, $u$ and $\zeta$ with 8
multiplications. Then, we compute polynomials $\tilde{\psi}$ and $\xi$ such that

$\tilde{\psi} = \tilde{\varphi} v \bmod u = \tilde{\varphi} \cdot v - \xi \cdot u$,

the correction by the additional term $th$ in the definition of $\psi$ being postponed.
Computing the 3 leading coefficients of $\tilde{\varphi} v$ takes $S(3) = 5$ multiplications, and
the three coefficients of the quotient $\xi$ by the monic $u$ are obtained with 3
multiplications. Then $\tilde{\psi}$ may be computed by interpolation on six points with 12
multiplications.

43/43



Since we worked with v instead of v, the polynomials $t$, $\varphi$ and $\psi$ have to be
adjusted by powers of $d$, at the same time as making $\psi$ monic. We profit from
the inversion of $\psi_5$ by computing simultaneously the inverse of $u_0$, which will be
needed later on, with 3 multiplications and one field inversion as described in
Section 3. Then, we obtain the minimum $e$ via

t = ii’s^d ■d)-i, ip = {ip^^ ■ d) ■ >p, Ip = ip^^ ■ -p + th. 

As will become clear in the following, we do in fact not need the coefficients $\psi_1$,
$\varphi_1$ and $\varphi_2$, whence this step can be carried out with 11 multiplications in the
superelliptic case. When $h \ne 0$, the computation of $t_0h_0$ requires an additional
multiplication, and $\psi_3$ and $\psi_4$ need the two leading terms of $th$, obtained with
$S(2) = 3$ multiplications.

57/61 



Define the polynomials $\lambda$ and $\mu$ of degree 2 and 1, respectively, as in the
equations before (3). In the following, we shall perform polynomial arithmetic
“from both sides” as described in Section 4.2. All constant coefficients are
computed separately. For instance, $\lambda_0$ and $\mu_0$ are obtained with 7 multiplications
if $h = 0$, including those by the already computed $u_0^{-1}$. For $C_{3,4}$ curves, $\mu_0$
requires additionally the computation of ighg, which is obtained with one extra
multiplication since $t_0h_0$ has already been used for $\psi_0$.

For /ti = —t 2 , there is nothing to do. Taking into account that /4 = 1, /s = 0 
and = 1, the numerator of A starts with 

(t2 - + (2ti ■ t2 - (fi2 - (P3 ■ Ip4)x'^ H , 

and these coefficients are computed with 3 multiplications. The two leading terms
of the quotient by $u$ require another multiplication, so that the total number of
multiplications for $\lambda$ and $\mu$ becomes 11 in the superelliptic and 12 in the $C_{3,4}$
case.

68/73 

The polynomial $u'$ of the result is computed via (3). For $h = 0$, the constant
coefficient $u'_0$ is easily seen to be computable with 7 multiplications, reusing
values like $\varphi_0$ already needed for $\lambda_0$ or $\mu_0$. If $h \ne 0$, then the term

ho ■ (ffiV’o • {toho - 2 ipo) + T’o • (^o/o + 7’oV’o)) 
requires only the 3 additional multiplications marked with a dot. 

75/83 

For the high degree part of u' , we compute the leading terms of the numerator 
Afis -/ -/ . . . of (3) as 

a = t2 • (A 2 - 2(v 33 -I- ft2)) -I- 31 / 4 , 

(3 = 3(ti • A 2 -k V'3 + V’l) - ^2 • (3(^2 + 2fti) -k {ifl - 3^2 • V’ 4 ) • T’3 
-kft2 • {tl ■ (ft2 + ^ 3 ) + (T’3 - 3t2V’4) - ^ 2^/4 - 2ti) , 

where we have used the relation A2 = — (fio- These quantities are obtained 

with 7 multiplications in the superelliptic case and 9 multiplications in the case
of $C_{3,4}$ curves. Then the leading coefficients of $u'$ are given by

u'o = u '2 = a — 2 u 5, u'l = P — uo ■ { 2 u 2 + U 5 ) — 2u4 

with 1 multiplication. 

83/93 



Finally, $v'$ is computed as $v' = \mu^{-1}\lambda \bmod u$ with 20 multiplications and one
inversion as described in Section 4.6.



103/113 
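The running counts printed after each step (superelliptic / $C_{3,4}$) can be re-added mechanically. The sketch below is only a bookkeeping check: the step labels and the way the counts are grouped are our reading of the text above, not a table taken from the paper.

```python
# Multiplication counts per step of the reduction, as read off the running text.
# Each entry: (label, superelliptic count, C_{3,4} count). Labels are our own.
STEPS = [
    ("t and zeta (symbolic Euclid)", 15, 15),
    ("phi by interpolation", 8, 8),
    ("3 leading coefficients of phi*v, S(3)", 5, 5),
    ("quotient xi by the monic u", 3, 3),
    ("psi by interpolation on six points", 12, 12),
    ("joint inversion of psi_5 and u_0", 3, 3),
    ("rescaling t, phi, psi", 11, 11 + 1 + 3),   # extra t_0*h_0 and S(2) = 3
    ("lambda and mu", 11, 12),
    ("constant coefficient of u'", 7, 7 + 3),
    ("leading coefficients of u'", 7 + 1, 9 + 1),
    ("v' as in Section 4.6", 20, 20),
]


def running_totals(steps):
    """Return the list of cumulative (superelliptic, C34) counts after each step."""
    totals, sup, c34 = [], 0, 0
    for _label, s, c in steps:
        sup += s
        c34 += c
        totals.append((sup, c34))
    return totals


if __name__ == "__main__":
    for (label, _s, _c), total in zip(STEPS, running_totals(STEPS)):
        print(f"{label:42s} {total[0]:4d} / {total[1]:4d}")
```

The cumulative values reproduce the checkpoints 15/15, 43/43, 57/61, 68/73, 75/83, 83/93 and 103/113 given in the text.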
In the following table, we summarise the number of multiplications carried
out by our algorithm for adding or doubling divisors, totalling the efforts for the 
composition and the reduction step. We distinguish between ordinary multipli- 
cations and squarings in the base field. While we did not pursue this distinction 
in the present article due to space restrictions, separating these two numbers is 
a simple exercise. The number of inversions is always 2. 





            superelliptic             C_{3,4}
            mult.  sqr.  m.+s.        mult.  sqr.  m.+s.
addition     129    11    140          139    11    150
doubling     143    21    164          153    21    174



6 Concluding Remarks 

Formulae for the arithmetic of hyperelliptic curves of genus 3 are reported in [18]. 
They require 76 field multiplications and one inversion for adding two distinct 
elements, and 71 multiplications and one inversion for doubling an element. 
While our formulae for superelliptic and $C_{3,4}$ curves need more operations, the
factor of only about 2 shows that $C_{3,4}$ curves constitute a reasonable alternative
to hyperelliptic curves for cryptographic use.

Availability of the Formulae 

The Magma code of our formulae can be downloaded from the web at the 
address 

http://www.lix.polytechnique.fr/Labo/Andreas.Enge/C34.html



Acknowledgement. Thanks to Pierrick Gaudry for his comments on our work. 
Abstract. The recent ideas of Agrawal, Kayal, and Saxena have
produced a milestone in the area of deterministic primality testing.
Unfortunately, their method and its successors are mainly of theoretical
interest, as they are much too slow for practical applications.

Via a totally different approach, Lukes et al. developed a test which is
conjectured to prove the primality of $N$ in time only $O((\lg N)^{3+o(1)})$.
Their (plausible) conjecture concerns the distribution of pseudosquares.
These are numbers which locally behave like perfect squares but are
nevertheless not perfect squares.

While squares are easy to deal with, this naturally gives rise to the 
question of whether the pseudosquares can be replaced by more general 
types of numbers. We have succeeded in extending the theory to the 
cubic case. To capture pseudocubes we rely on interesting properties of 
elements in the ring of Eisenstein integers and suitable applications of 
cubic residuacity. Surprisingly, the test itself is very simple as it can be 
formulated in the integers only. Moreover, the new theory appears to 
lead to an even more powerful primality testing algorithm than the one 
based on the pseudosquares. 



1 Introduction 

1.1 Motivation 

The aim of this paper is both theoretical and practical. It has been known
since the 1930s [17] that the theory of so-called pseudosquares yields a very
powerful machinery for primality testing of large integers $N$. In fact, assuming
some reasonable heuristics [17] this gives a deterministic primality test in time
$O((\lg N)^{3+o(1)})$, which should be compared to the times $O((\log N)^{10.5+o(1)})$, resp.
$O((\log N)^{6+o(1)})$, required by AKS and successors [1,15], or for their random time
counterparts [2,5,6], which require time $O((\log N)^{4+o(1)})$.

The above heuristics have been confirmed for numbers up to $2^{80}$. Based on the
investigations made so far, this makes the pseudosquare approach the most
efficient primality testing algorithm for primes of about 100 binary digits, without
* Research supported under APART [Austrian Programme for Advanced Research 
and Technology] by the Austrian Academy of Sciences. 

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 102-116, 2004. 

© Springer- Verlag Berlin Heidelberg 2004 
requiring any unproven assumptions. Specifically, for primes of this size, only 73
modular exponentiations plus some trial divisions are required to prove
primality, see [4]. It should be noted here that this is similar to the amount of time that
typically would be spent on any probabilistic method, like Miller-Rabin, which,
however unlikely, might return the wrong answer.

Pseudosquares are integers which behave locally like squares modulo all
primes in a certain range, while they are themselves not squares. A natural
problem is to extend the theory of these pseudosquares to more general types
of numbers. Numbers that ‘behave’ like an $x$th power without being a perfect
$x$th power are of interest here. However, even extending the theory to the case
$x = 3$ has proved to be challenging. It has not even been clear how such numbers
are to be defined, nor how to investigate their theoretical behaviour, let alone
their practical applications. Indeed, in the 1980s D. Lehmer posed a question
tantamount to this to one of the authors (Williams), and it has remained a
challenging open problem since.

1.2 The Main Results 

In this note we succeed in giving a positive answer to the above in two respects. 
We establish the theoretical foundations and properties for pseudocubes and 
develop practical applications. The theory involves a number of new results by 
making use of cubic residuacity and cubic reciprocity. 

While the quadratic theory is very simple and well-understood, the extension
of its important features to the cubic case is not easily apparent. Surprisingly,
while the theory is more involved we have managed to describe the cubic coun- 
terpart in a clear and simple manner. Although the theoretical setting is the 
ring of the Eisenstein integers, the applications can easily be described and put 
into practice in the ring of integers only. Growth estimates of the pseudocubes 
suggest that the primality testing algorithms based on the new theory eventually 
will even become more efficient than those based on the pseudosquares. 

While the primality test that we develop is quite efficient for any given $N$ in
a certain range, it must be borne in mind that determining that range requires
a lengthy precomputation, namely the process of determining (or bounding from
below) $M_x$, which is likely not possible in polynomial time. Once this has been
performed, however, we can apply our test of complexity $O((\log N)^{3+o(1)})$ to any $N$
in the range.



2 Background and Sketch of the Problem 

2.1 Pseudosquares 

Definition 1. Let $p$ be an odd prime. The pseudosquare $L_p$ is defined as follows.

1. $L_p \equiv 1 \bmod 8$,

2. the Legendre symbol $\left(\frac{L_p}{q}\right) = 1$ for all odd primes $q \le p$,

3. $L_p$ is the least positive nonsquare integer satisfying (1) and (2).
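To make Definition 1 concrete, here is a minimal brute-force sketch. It is only meant for very small $p$; the function names are ours, and it reads condition (2) as "for all odd primes $q \le p$". Practical computations of pseudosquares use sieving methods instead.

```python
from math import isqrt


def odd_primes_up_to(p):
    """All odd primes q <= p, by trial division (fine for tiny p)."""
    return [q for q in range(3, p + 1, 2)
            if all(q % d for d in range(3, isqrt(q) + 1, 2))]


def legendre(a, q):
    """Legendre symbol (a/q) for an odd prime q, via Euler's criterion."""
    r = pow(a, (q - 1) // 2, q)
    return -1 if r == q - 1 else r


def pseudosquare(p):
    """Least positive nonsquare n = 1 mod 8 with (n/q) = 1 for all odd primes q <= p."""
    primes = odd_primes_up_to(p)
    n = 1
    while True:
        n += 8                      # run through 9, 17, 25, ... (all n = 1 mod 8)
        if isqrt(n) ** 2 == n:      # skip perfect squares
            continue
        if all(legendre(n, q) == 1 for q in primes):
            return n


if __name__ == "__main__":
    for p in (3, 5, 7):
        print(p, pseudosquare(p))
```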

The main result of the paper will be a result analogous to Theorem 1 (see 
[17]) in terms of pseudocubes. 

Theorem 1. If

- $N < L_p$,

- $q^{(N-1)/2} \equiv \pm 1 \bmod N$ for all primes $q$, s.t. $2 \le q \le p$,

- $q^{(N-1)/2} \equiv -1 \bmod N$ for at least one prime $q$, s.t. $2 \le q \le p$,

then $N$ is a prime or a power of a prime.

The above is a simplified and somewhat weaker version of the theorem in [17],
which as an additional feature includes trial division. To extend this theory, we
investigate the underlying properties. The difficult part is to establish that no
composite integers $N$ survive the test. Suppose that $P_i$, $i \ge 2$, are different primes
dividing $N$. Then the conditions of the theorem imply the following, where $\nu_2(m)$
is the highest power of 2 dividing $m$,

$\nu_2(P_i - 1) = \nu_2(N - 1)$ for all $P_i \mid N$, (2.1)

if $\left(\frac{c}{P_1P_2}\right) \ne 1$ then $c^{(N-1)/2} \not\equiv \pm 1 \bmod P_1P_2$. (2.2)

Both of these lines immediately imply the above assertion, $c^{(N-1)/2} \not\equiv \pm 1 \bmod N$,
which proves $N$ does not fulfill Solovay-Strassen, and hence is not prime.
The correctness of these two properties is established via Definition 1 through
the use of quadratic reciprocity. We give a short summary to collect the main
features necessary for cubic reciprocity.

2.2 The Ring of Eisenstein Integers 

We first fix notation and recall some useful facts for which a general reference is
[13] (see also [10,14]). Let $\omega = e^{2\pi i/3} = (-1 + \sqrt{-3})/2$, a primitive cube root of
unity, and let $D$ denote the ring $\mathbb{Z}[\omega]$. For any $\alpha = a + b\omega \in D$ we shall denote the
norm and the trace by $N(\alpha) = \alpha\bar\alpha = a^2 - ab + b^2$ and $Tr(\alpha) = \alpha + \bar\alpha = 2a - b$,
respectively, where $\bar\alpha$ denotes complex conjugation.

$D$ is a unique factorization domain, in which there are three types of primes:
the inert rational primes $q \equiv 2 \bmod 3$ of norm $q^2$, the primes $\pi$ of prime norm
$p = \pi\bar\pi \equiv 1 \bmod 3$, and the prime $1 - \omega$ which lies over 3 as $3 = -\omega^2(1 - \omega)^2$.
The units in $D$ are the six roots of unity $\pm\omega^i$ for $i = 0, 1, 2$. An element $\alpha \in D$ is
called primary, if $\alpha \equiv 2 \bmod 3$. With the exception of $1 - \omega$ (having no primary
associate), every prime in $D$ has exactly one primary associate.

An easy way to find the decomposition of $p \equiv 1 \bmod 3$ as $p = \pi\bar\pi$ can be
obtained from the quadratic partition of $p$ (see [21]), resp. $4p$. Indeed, any prime
$p \equiv 1 \bmod 3$ can be written as

$4p = L^2 + 27M^2$, $L \equiv 1 \bmod 3$,

with $L$ and $\pm M$ unique integers. From this, it follows that

$\pi = \frac{L + 3M}{2} + 3M\omega$.

Conversely, to any $\pi = a + b\omega$, primary, we associate the pair of integers $(L, M)$
defined as $L = 2a - b$ and $M = b/3$. We note that the pair of integers associated
in this way to $\bar\pi$ is $(L, -M)$.
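The passage above can be sketched in a few lines. The following is an illustrative brute-force search (the function name is ours) for the representation $4p = L^2 + 27M^2$ with $L \equiv 1 \bmod 3$ and $M > 0$; it returns the coefficients $(a, b)$ of $\pi = a + b\omega$. For large $p$ one would use Cornacchia's algorithm rather than this loop.

```python
from math import isqrt


def primary_prime_over(p):
    """Return (a, b) with pi = a + b*omega primary and N(pi) = p.

    Assumes p is a prime with p % 3 == 1. Searches 4p = L^2 + 27 M^2 with
    M > 0 and normalises the sign of L so that L = 1 mod 3.
    """
    if p % 3 != 1:
        raise ValueError("p must be a prime congruent to 1 mod 3")
    M = 1
    while 27 * M * M <= 4 * p:
        rest = 4 * p - 27 * M * M
        L = isqrt(rest)
        if L * L == rest:
            if L % 3 != 1:
                L = -L              # pick the sign with L = 1 mod 3
            return ((L + 3 * M) // 2, 3 * M)
        M += 1
    raise ValueError("no representation found; is p prime?")


if __name__ == "__main__":
    for p in (7, 13, 19, 31):
        a, b = primary_prime_over(p)
        print(p, (a, b), a * a - a * b + b * b)
```

Choosing $M < 0$ instead yields the conjugate $\bar\pi$, in line with the remark that $\bar\pi$ corresponds to $(L, -M)$.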



2.3 The Cubic Symbol and the Cubic Reciprocity Law 



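For any $\alpha \in D$ and any prime $\pi \in D$ with $N(\pi) \ne 3$ the cubic residue character
$\left(\frac{\alpha}{\pi}\right)_3$ is defined as follows,

- if $\pi \mid \alpha$ then $\left(\frac{\alpha}{\pi}\right)_3 = 0$,

- if $\pi \nmid \alpha$ then $\left(\frac{\alpha}{\pi}\right)_3$ is the unique element $\omega^i \in D$ satisfying
$\alpha^{(N(\pi)-1)/3} \equiv \omega^i \bmod \pi$,

- if $\alpha, \beta \in D$ with $3 \nmid N(\beta)$ we define $\left(\frac{\alpha}{\beta}\right)_3 = \prod_{i=1}^{r} \left(\frac{\alpha}{\pi_i}\right)_3$, where $\beta = \prod_{i=1}^{r} \pi_i$
and all $\pi_i \in D$ are prime.

Clearly, if $\pi$ and $\gamma$ are primary, then $-\pi\gamma$ is
primary. If $\pi$ is primary, then $\pi = \pm\pi_1 \cdots \pi_r$, where the $\pi_i$ are (not necessarily
distinct) primary primes.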



We note the following properties. 



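1. The symbol is bimultiplicative, that is
$\left(\frac{\alpha\beta}{\gamma}\right)_3 = \left(\frac{\alpha}{\gamma}\right)_3 \left(\frac{\beta}{\gamma}\right)_3$ and $\left(\frac{\alpha}{\beta\gamma}\right)_3 = \left(\frac{\alpha}{\beta}\right)_3 \left(\frac{\alpha}{\gamma}\right)_3$.

2. Conjugation acts on the symbol as follows:
$\overline{\left(\frac{\alpha}{\beta}\right)_3} = \left(\frac{\alpha}{\beta}\right)_3^{-1} = \left(\frac{\alpha}{\beta}\right)_3^{2} = \left(\frac{\bar\alpha}{\bar\beta}\right)_3$.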

The Cubic Reciprocity Law (CRL) is often formulated in terms of primary
primes in $D$. By the properties just stated, this then automatically yields the
more general form, as follows. Let $\alpha$ and $\beta$ be primary elements of $D$ of coprime
norm. Then

$\left(\frac{\alpha}{\beta}\right)_3 = \left(\frac{\beta}{\alpha}\right)_3$.

The Supplement to the CRL treats the case for $\omega$ and $1 - \omega$, which, not
being primary, are not covered by the general CRL. Suppose that $N(\alpha) \ne 3$. If
$\alpha = a + 3b\omega$ is primary (including the case $b = 0$) and $a = 3m - 1$ then

$\left(\frac{1 - \omega}{\alpha}\right)_3 = \omega^{2m}$ and $\left(\frac{\omega}{\alpha}\right)_3 = \omega^{m+b}$.
2.4 Extending the Quadratic to the Cubic Case: Problems



If $P_1$ and $P_2$ ($P_1 \ne P_2$) are prime divisors of $N$, then, since $P_1P_2 \le N$, by the
definition of $L_p$, there is some prime $q \le p$ such that $\left(\frac{P_1P_2}{q}\right) \ne 1$. By letting
$c = -1$, $2$, or $q$, accordingly, quadratic reciprocity gives $\left(\frac{c}{P_1P_2}\right) \ne 1$, which is
the situation of (2.2), and we may assume this value is $-1$. Analogous results will be obtained
when working in $D$, where the basis $c$ will be replaced by a suitable $\alpha \in D$,
and the $L_p$ by its relevant cubic counterpart. For this section suppose we know
$N = \nu\bar\nu$ and $\pi_i$ is any primary prime in $D$ lying over $P_i$ dividing $N$. In trying to
extend the quadratic theory we encounter a number of difficulties, for example:

Using an argument similar to the above (see [17]), it is easy to verify (2.1).
This then immediately implies that if $N$ is composite, it has to have at least
three distinct prime divisors. The same is not true for the cubic analogue.

Any odd integer $N$ is only divisible by primes $\equiv 1 \bmod 2$, but a priori it is
not known if, when $N \equiv \sigma \bmod 3$, for $\sigma \in \{1, -1\}$, that also each $P_i \equiv \sigma \bmod 3$.

The theory of the pseudosquares gives us elements for which the value of the
quadratic character (mod $N$ or $P_i$ respectively), is different from 1. However,
we note that if in the CRL both $\alpha$ and $\beta$ are rational then the assertion is
trivially true and follows from the fact that $\left(\frac{n}{q}\right)_3 = 1$ for any integer $n$ prime to $q$ and
any $q \equiv -1 \bmod 3$. Unfortunately this implies that in extending the theory we
require complex elements in the ring of Eisenstein integers.

Straightforward generalization of the definition of pseudosquares and of
Theorem 1 into the setting of the Eisenstein integers would involve some
challenges. Firstly, there are several ways that pseudocubes may be defined as
certain elements lying over rational primes. Unfortunately none of these
approaches seems to be practical. Secondly, $N$ would need to be decomposed as
$N = \nu\bar\nu$ if $N \equiv 1 \bmod 3$. Similarly, each prime $q \le p$ which is $1 \bmod 3$, would
need to be decomposed as $\lambda\bar\lambda = q$. Finally, the testing condition formulated
utilizing the theory of the cubic symbol would lead to the testing conditions 
y(AT-i)/3 = mod V, and A^^ mod v. Below, we will avoid this 

quite involved setting and develop a practical approach which can be verified in 
the integers only. 



3 Pseudocubes 

3.1 Definition and Fundamental Properties 

Definition 2. For any non negative real number $x$, the pseudocube $M_x$ is defined
as the smallest positive integer satisfying the following properties.

1. $M_x \equiv \pm 1 \bmod 9$,

2. $M_x^{(q-1)/3} \equiv 1 \bmod q$ for all primes $q \le x$, $q \equiv 1 \bmod 3$,

3. $M_x$ is relatively prime to all primes $q \le x$,

4. $M_x$ is not a cube of an integer.

Although condition (3) is not necessary in the definition, it is natural to have
it included and the reasons for doing so will become apparent below.
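As an illustration of Definition 2, the following naive search (our own sketch, usable only for very small $x$) tests the four conditions directly. The helper names are hypothetical and the conditions are read with "$q \le x$".

```python
from math import isqrt


def primes_up_to(x):
    """All primes q <= x by trial division (x may be a real number)."""
    return [q for q in range(2, int(x) + 1)
            if all(q % d for d in range(2, isqrt(q) + 1))]


def is_cube(n):
    """True if the positive integer n is a perfect cube."""
    r = round(n ** (1.0 / 3.0))
    return any((r + d) ** 3 == n for d in (-1, 0, 1))


def pseudocube(x):
    """Smallest positive integer satisfying conditions 1-4 of Definition 2."""
    primes = primes_up_to(x)
    n = 1
    while True:
        n += 1
        if n % 9 not in (1, 8):                      # condition 1
            continue
        if is_cube(n):                               # condition 4
            continue
        if any(n % q == 0 for q in primes):          # condition 3
            continue
        if all(pow(n, (q - 1) // 3, q) == 1          # condition 2
               for q in primes if q % 3 == 1):
            return n


if __name__ == "__main__":
    for x in (7, 13):
        print(x, pseudocube(x))
```

For instance, 71 is congruent to $-1$ mod 9 and to 1 mod 7, so it is a cubic residue modulo 7, but it fails condition 2 modulo 13; the search for $x = 13$ therefore continues to 181.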

We also introduce the following notation: 

Definition 3. For any prime $q$ we choose $\alpha_q$, a primary prime in $D$ lying over
$q$. So $\alpha_q = q$ if $q \equiv -1 \bmod 3$ and $\alpha_q\bar\alpha_q = q$ if $q \equiv 1 \bmod 3$. So there are two
possibilities for $\alpha_q$ in the latter case. Also, we denote by $\lambda_q$ the quotient $\alpha_q/\bar\alpha_q$.

The quadratic reciprocity law allows us to give an alternative definition of 
the pseudosquares. Similarly, the CRL allows us to give the following alternative 
definition for the pseudocubes. 

Proposition 1. $M_x$ is the smallest integer $\equiv \pm 1 \bmod 9$ which satisfies

1. $\left(\frac{\alpha_q}{M_x}\right)_3 = 1$ for all primes $q \le x$, $q \ne 3$.

2. $M_x$ is not a cube of an integer.

s. ( k ;)3 = 1 - 

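Proof. (1) Let $q$ be a prime less than or equal to $x$ and $N = M_x$. If $q \equiv -1 \bmod 3$,
then, by definition and CRL, $\left(\frac{\alpha_q}{N}\right)_3 = \left(\frac{q}{N}\right)_3 = \left(\frac{N}{q}\right)_3 = 1$. If $q \equiv 1 \bmod 3$ then,
by definition of a pseudocube we must have $N^{(q-1)/3} \equiv 1 \bmod q$, hence also
modulo $\alpha_q$, so $\left(\frac{N}{\alpha_q}\right)_3 = 1$. The CRL gives then that $\left(\frac{\alpha_q}{N}\right)_3 = 1$, as desired.

(2) is part of the definition of a pseudocube.

The final assertion follows from the supplement to the CRL since $M_x \equiv \pm 1 \bmod 9$. □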



3.2 Useful Lemmas 



If $\alpha$ and $\beta$ are complex integers in $D$ then knowledge of $(\alpha/\beta)_3$ reveals no
information concerning $(\alpha/\bar\beta)_3$. In this context the following lemma which gives
a result for the product $(\alpha/\beta)_3 (\alpha/\bar\beta)_3$ is useful, see also [12,8].

Lemma 1. Suppose $\alpha, \beta$ are distinct primary complex elements in $D$. Then

$\left(\frac{\alpha/\bar\alpha}{\beta}\right)_3 = \left(\frac{\alpha}{\beta\bar\beta}\right)_3$.

Proof. 

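As for primes $p \equiv -1 \bmod 3$, $N(p) = p^2$, the cubic symbol involves
exponents $(p^2 - 1)/3$. A simplification involving only the exponent $(p + 1)/3$
is given in [8]. In fact, $\left(\frac{\alpha}{p}\right)_3 \equiv (\bar\alpha/\alpha)^{(p+1)/3} \bmod p$. Also, from Lemma 1, the
analogous result for $p = \pi\bar\pi \equiv 1 \bmod 3$ reads $\left(\frac{\alpha}{p}\right)_3 \equiv (\alpha/\bar\alpha)^{(p-1)/3} \bmod \pi$.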

The last two lines motivate the following definition, which is analogous to 
the one given in the biquadratic setting [13,7]. The convenience of this notation 
will become more apparent later on. 
Definition 4. For any integer $N$, $3 \nmid N$, we let $N^* = N$ if $N \equiv 1 \bmod 3$, and
$N^* = -N$, when $N \equiv -1 \bmod 3$.

We note that $3 \mid N^* - 1$ and $(NM)^* = N^*M^*$.

We are now ready for the notion of an Eisenstein pseudoprime w.r.t. the base 
q. Recall the notation given in Definition 3. 

Definition 5. Let $q$ be a prime not dividing $N$. We say that $N$ is an Eisenstein
pseudoprime w.r.t. the base $q$ if $N$ is composite and

$\left(\frac{\alpha_q}{N}\right)_3 \equiv \lambda_q^{(N^*-1)/3} \bmod N$.

We next show that the definition does not depend on the choice of the primary
element $\alpha_q$. This only has to be verified when $q \equiv 1 \bmod 3$. The other choice of
a primary element would be $\bar\alpha_q$, so the properties of the symbol would lead to
the inverse on the left hand side. But the corresponding $\lambda_q$ would also lead to
the inverse on the right hand side, and this establishes our claim.

We summarize the above in the following which verifies that Definition 5 is 
really a notion of pseudoprime. 

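Proposition 2. If $N$ is a prime then it fulfills the condition in Definition 5
w.r.t. the base $q$ for any prime $q \ne N$.

Proof. Suppose $N \equiv 1 \bmod 3$. Let $\nu$ be a primary prime lying over $N$. So $N = \nu\bar\nu$.
By Lemma 1, $\left(\frac{\alpha_q}{N}\right)_3 = \left(\frac{\alpha_q/\bar\alpha_q}{\nu}\right)_3 = \left(\frac{\lambda_q}{\nu}\right)_3 \equiv \lambda_q^{(N-1)/3} \bmod \nu$.

Since $\nu$ was arbitrary the same equality holds modulo $\bar\nu$, hence modulo $N$.

Suppose $N \equiv -1 \bmod 3$. Then it remains prime in $D$. So in this case
$\left(\frac{\alpha_q}{N}\right)_3 \equiv \alpha_q^{(N^2-1)/3} \equiv \lambda_q^{(N^*-1)/3} \bmod N$. □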

Remark 1. Note that if $q$ is a prime $\equiv -1 \bmod 3$ then any composite $N$ relatively
prime to $q$ is an Eisenstein pseudoprime w.r.t. the base $q$. This is because on one
hand $\left(\frac{\alpha_q}{N}\right)_3 = \left(\frac{q}{N}\right)_3 = 1$ and on the other hand, $\lambda_q = 1$, so the equality trivially
holds.

3.3 The Main Theorem 

Theorem 2. Let $N$ be odd, $N < M_x^{2/3}$, $N$ not a prime or a prime power.

Then, there is a prime $q \le x$, $q \equiv 1 \bmod 3$ such that $N$ is not an Eisenstein
pseudoprime w.r.t. the base $q$.

The proof of the theorem will be based on the lemma below. We first intro- 
duce some notation that will facilitate the statement and proof of the lemma. 

- We will say that $N$ satisfies Hypothesis (*) if $N$ is odd, not a prime power,
and

$\left(\frac{\alpha_q}{N}\right)_3 \equiv \lambda_q^{(N^*-1)/3} \bmod N$ holds for all primes $q \le x$, $q \ne 3$.

- Let $\nu_3(N^* - 1) = s$ (that is, $3^s$ is the exact power of 3 dividing $N^* - 1$). Let
$t_{N^*} = (N^* - 1)/3^s$. Note that $t_{N^*}$ is an integer and $3 \nmid t_{N^*}$. Let $P$ be any
prime divisor of $N$ and denote $t_{P^*} = (P^* - 1)/3^s$.

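Lemma 2. Suppose $N$ satisfies hypothesis (*), $N < M_x$, and $N$ is not a cube
of an integer. Then,

1. $\nu_3(P^* - 1) = \nu_3(N^* - 1)$,

2. for all primes $q$, $q \ne 3$, $q \le x$ we have:

$\left(\frac{\alpha_q}{N}\right)_3^{t_{P^*}} = \left(\frac{\alpha_q}{P}\right)_3^{t_{N^*}}$.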



Proof. We first claim that $\nu_3(P^* - 1) \ge \nu_3(N^* - 1)$. If $s = \nu_3(N^* - 1) = 1$ then
the inequality holds trivially so we assume that $s$ is at least 2, so $N^* \equiv 1 \bmod 9$.
Since $N < M_x$, then by Proposition 1 there is a prime $q \le x$, such that $\left(\frac{\alpha_q}{N}\right)_3 \ne 1$.
Since $N$ satisfies hypothesis (*) then $\lambda_q^{(N^*-1)/3} \equiv \left(\frac{\alpha_q}{N}\right)_3 \bmod N$, hence mod $\pi$,
for any prime $\pi$ in $D$ dividing $N$. It follows that the element $\lambda_q^{t_{N^*}}$ has order
$3^s$ mod $\pi$, so $3^s$ divides $N(\pi) - 1$. In both cases, $P \equiv \pm 1 \bmod 3$, this leads to $3^s$
divides $P^* - 1$, which proves our first claim. It follows that $t_{P^*}$ is an integer.

We will prove part 2) of the lemma and then use that to prove part 1).
Consider the following two equalities, which hold for all primes $q$ less than or equal
to $x$ and different from 3:

$\left(\frac{\alpha_q}{N}\right)_3 \equiv \lambda_q^{(N^*-1)/3} \bmod N$ (3.1)

(which is hypothesis (*)), and

$\left(\frac{\alpha_q}{P}\right)_3 \equiv \lambda_q^{(P^*-1)/3} \bmod P$

(which holds by Proposition 2 because $P$ is prime).

Since $P$ divides $N$ then both equalities hold mod $P$. Raising the first equality
to the $t_{P^*}$ and the second to the $t_{N^*}$ we obtain

$\left(\frac{\alpha_q}{N}\right)_3^{t_{P^*}} \equiv \left(\frac{\alpha_q}{P}\right)_3^{t_{N^*}} \bmod P$.

Since $P \ne 3$ then this leads to the equality, which is part 2) of the lemma.

To prove 1) it suffices to show that $t_{P^*}$ is not a multiple of 3. If $P^* \not\equiv 1 \bmod 9$
then $\nu_3(P^* - 1) = 1$, and since we proved that $\nu_3(P^* - 1) \ge \nu_3(N^* - 1)$ it follows
that $s = \nu_3(N^* - 1) = 1$, so $t_{P^*} = (P^* - 1)/3 \not\equiv 0 \bmod 3$, since $P^* \not\equiv 1 \bmod 9$.

Alternatively, if $P^* \equiv 1 \bmod 9$, then since $P$ is a divisor of $N$ then $P$ is also
$< M_x$. By Proposition 1, there must be a prime $q \le x$ ($q \ne 3$) such that $\left(\frac{\alpha_q}{P}\right)_3$ is
not 1. Since $t_{N^*}$ is not a multiple of 3, then, for this particular $\alpha_q$ the right hand side
of the equality in 2) is not 1, hence the left side cannot be 1, which shows $t_{P^*}$
cannot be a multiple of 3. □
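Proof of Theorem 2. Under the hypothesis in the theorem, $N$ must have more
than one prime divisor. Let $P$ and $Q$ be two different prime divisors of $N$. We
assume that $N$ satisfies hypothesis (*). Applying the lemma to both $P$ and $Q$
we obtain:

$\left(\frac{\alpha_q}{N}\right)_3^{t_{P^*}} = \left(\frac{\alpha_q}{P}\right)_3^{t_{N^*}}$,

$\left(\frac{\alpha_q}{N}\right)_3^{t_{Q^*}} = \left(\frac{\alpha_q}{Q}\right)_3^{t_{N^*}}$.

Using the fact that $(t_{P^*})^2 \equiv (t_{Q^*})^2 \equiv (t_{N^*})^2 \equiv 1 \bmod 3$, one can deduce from the
two equalities that

$\left(\frac{\alpha_q}{P}\right)_3^{t_{P^*}} = \left(\frac{\alpha_q}{Q}\right)_3^{t_{Q^*}}$

holds for all primes $q$ less than or equal to $x$.

It follows that for $i = 1$ or $2$ we get

$\left(\frac{\alpha_q}{P^iQ}\right)_3 = \left(\frac{\alpha_q}{P}\right)_3^{i} \left(\frac{\alpha_q}{Q}\right)_3 = \left(\frac{\alpha_q}{P}\right)_3^{i + t_{P^*}t_{Q^*}}$.

This last exponent is $\equiv 0 \bmod 3$ for $i \equiv -t_{P^*}t_{Q^*} \bmod 3$, which in particular
implies $(P^iQ)^* \equiv 1 \bmod 9$. Also,

$\left(\frac{\alpha_q}{P^iQ}\right)_3 = 1$ for all $q$ up to $x$ for $i = 1$ or $2$. (3.5)

But $P^iQ \le P^2Q \le PN < N^{3/2} < M_x$, which gives a contradiction to the
minimality of the pseudocube $M_x$. It follows that hypothesis (*) fails to be true
for some $q$ up to $x$ and, finally, by Remark 1 this has to be the case for some
$q \equiv 1 \bmod 3$. □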

4 Practical Considerations and Estimates on Pseudocubes

4.1 A More Practical Version 

Recall that while Theorem 1 tests for the Solovay-Strassen condition on the
left hand side, it does not require computing the quadratic symbol on the right
hand side. Instead, it is required that the $-1$ appears on the right hand side at
least once. Unfortunately this requirement on the existence of a basis which is a
quadratic nonresidue makes the algorithm a random time deterministic test. In
fact, this requirement on the $-1$ makes the test essentially a Solovay-Strassen
test. It is known [11, Exercise 3.24], that if $N$ is odd and if there exists an integer
$b$ with $b^{(N-1)/2} \equiv -1 \bmod N$, then any integer $a$ with $a^{(N-1)/2} \equiv \pm 1 \bmod N$
also satisfies $a^{(N-1)/2} \equiv \left(\frac{a}{N}\right) \bmod N$.

On the other hand, since any witness to the Solovay-Strassen test (or to 
the Euler pseudoprime test) is also a witness to the Miller-Rabin test (or the 
strong pseudoprime test), then the symbol need not be computed. So to do 
something equally effective in our case, namely, to avoid the computation of the 
cubic symbol we will first give the notion of a strong cubic pseudoprime. 

Definition 6. Let $q$, $\alpha_q$, $\lambda_q$, $N^*$, and $t_{N^*}$ be as above. We say that $N$ is a strong
cubic pseudoprime w.r.t. base $q$ if $N$ is composite and either

$\lambda_q^{t_{N^*}} \equiv 1 \bmod N$,

or there is some $i$, $0 \le i \le s - 1$, such that

$\lambda_q^{t_{N^*}3^i} \equiv \omega^j \bmod N$, for $j = 1$ or $-1$.

The following result is an extension of a well known theorem of Selfridge [20] 
to the cubic case. 

Theorem 3. If N is a strong cubic pseudoprime w.r.t. the base q, then N is an 
Eisenstein pseudoprime w.r.t. q. 

Proof. Let A*'^* = 1 mod N. Because 3|tAr*, this implies p 

for any P\N. So = 1 for tt lying over P. Now 

hence (^)3 = 1, as desired. 

Now assume A*"*^ = oj^ (mod N) for some i and j, 0 < i < s— 1, j ^ 0 mod 
3. If /ip is the least positive integer such that Xj^^ = 1 mod P for any prime P\N, 
then /ip|3*+^tAf, but /ip| 3Tat* . But /tp divides P* — l by Proposition 2, so that 
P* = 1 + for some integer kp. 

Note that if is a product of (not necessarily different) primes P, then 

kN- = ^ /jp mod 3. (4.1) 

P|AT 

This follows since N* = Opiat which gives l+3®tAP = l+3*+3 J2p\n 
3^*+^, and hence, = J2p\n^p 3 *+^. 

Again, by Proposition 2, = Xg^ kptff hypothesis, 

AqS mod N. From these two equations it follows that 




Multiplying over all primes P dividing N we obtain (^) 3 ^* = = 

tiv* ^ where the last step follows from (4.1). Finally, since fjg, = 1 mod 3, 
we get i^)^= mod N. □ 
As a corollary of Theorem 2 and Theorem 3 we obtain the following:

Theorem 4. Let $N$ be as in Theorem 2. Then there is a prime $q$, $q \equiv 1 \bmod 3$,
$q \le x$, such that $N$ is not a strong cubic pseudoprime w.r.t. the base $q$.

For practical purposes in determining the primality of a certain $N < M_x^{2/3}$,
the use of Theorem 4 is faster, since it requires no calculation of the cubic
residuacity symbol. We will show that the cost of a round of the strong cubic
pseudoprime test is practically about the same as the cost of one round of the
classic strong pseudoprime test.

4.2 Evaluation 

For any prime $q \equiv 1 \bmod 3$, $q \le p$, the primary primes $\alpha_q$ can easily be found
via Cornacchia-Smith (see e.g. [9,11]) or by trial, since $q$ is small. This gives
$\alpha_q = a + b\omega$, from where we get $\lambda_q = \alpha_q/\bar\alpha_q \bmod N$. The evaluation of powers
of $\lambda_q = c + d\omega$ modulo $N$ according to Definition 6 can be done very efficiently in
the integers only, as shown in the following lemma. The result is an element in
$D$, reduced modulo $N$, and one only needs to verify whether this is $1$, $\omega$, or $\omega^2$,
as desired.
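The check of Definition 6 can be sketched directly with pairs $(c, d)$ representing $c + d\omega$ modulo $N$. This is a plain square-and-multiply in $\mathbb{Z}[\omega]/(N)$, not the trace recurrence of the lemma that follows, and all function names are ours. Two conventions are assumptions of the sketch: $\lambda_q$ is computed as $\alpha_q^2 \cdot q^{-1} \bmod N$ (using $\alpha_q\bar\alpha_q = q$), and for $N \equiv -1 \bmod 3$ the exponent is taken from $|N^* - 1| = N + 1$, which only replaces $\lambda_q$ by its conjugate.

```python
from math import gcd, isqrt


def primary_prime_over(q):
    """(a, b) with a + b*omega primary of norm q, from 4q = L^2 + 27 M^2."""
    M = 1
    while 27 * M * M <= 4 * q:
        rest = 4 * q - 27 * M * M
        L = isqrt(rest)
        if L * L == rest:
            if L % 3 != 1:
                L = -L
            return ((L + 3 * M) // 2, 3 * M)
        M += 1
    raise ValueError("q must be a prime congruent to 1 mod 3")


def mul(x, y, N):
    """(a + b w)(c + d w) mod N, using w^2 = -1 - w."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % N, (a * d + b * c - b * d) % N)


def power(x, e, N):
    """x**e in Z[w]/(N) by binary exponentiation (e >= 0)."""
    result = (1 % N, 0)
    while e:
        if e & 1:
            result = mul(result, x, N)
        x = mul(x, x, N)
        e >>= 1
    return result


def passes_strong_cubic_test(N, q):
    """True if N is prime or a strong cubic pseudoprime w.r.t. the base q."""
    if N < 2 or N % 3 == 0 or gcd(N, q) != 1:
        raise ValueError("need N > 1 coprime to 3 and to q")
    alpha = primary_prime_over(q)
    q_inv = pow(q, -1, N)
    sq = mul(alpha, alpha, N)
    lam = (sq[0] * q_inv % N, sq[1] * q_inv % N)   # lambda_q = alpha^2 / q
    n_star = N if N % 3 == 1 else -N
    t, s = abs(n_star - 1), 0
    while t % 3 == 0:                              # |N* - 1| = 3^s * t
        t //= 3
        s += 1
    x = power(lam, t, N)
    if x == (1 % N, 0):
        return True
    omega, omega2 = (0, 1 % N), ((N - 1) % N, (N - 1) % N)
    for _ in range(s):                             # i = 0, ..., s - 1
        if x in (omega, omega2):
            return True
        x = power(x, 3, N)
    return False
```

For example, $N = 35$ with base $q = 13$ is rejected: here $s = 2$, $t = 4$, and $\lambda_{13}^4$ is congruent to 1 modulo 5 but to $\omega^2$ modulo 7, so it is none of $1, \omega, \omega^2$ modulo 35.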

Lemma 3. Given $\lambda_q = c + d\omega$, the evaluation of $\lambda_q^m \bmod N$ can be achieved in
$\log(m)$ squarings and $\log(m)$ multiplications, modulo $N$.

Proof. We define the integers $T_j = T_j(q)$ as $Tr(\lambda_q^j) = \lambda_q^j + \bar\lambda_q^j = \lambda_q^j + 1/\lambda_q^j \bmod N$.
They satisfy the recurrence relations

$T_{2j} = T_j^2 - 2$, $T_{2j+1} = T_jT_{j+1} - T_1$ (4.2)

with the initial values $T_0 = 2$, $T_1 = \lambda_q + 1/\lambda_q \bmod N$. Thus, if we have
$T_j \bmod N$ and $T_{j+1} \bmod N$, then we may compute either the pair $T_{2j} \bmod N$,
$T_{2j+1} \bmod N$, or the pair $T_{2j+1} \bmod N$, $T_{2j+2} \bmod N$, with each choice taking
one multiplication modulo $N$, and one squaring modulo $N$. This way, starting
from $T_0, T_1$ we can recursively arrive at any pair $T_k, T_{k+1}$. Which type of the
two moves has to be made can be read off from the binary representation of $m$
in the target pair $m, m + 1$, see e.g. [11].
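The binary ladder described in the proof can be sketched as follows. The function name is ours; the input `T1` stands for $\lambda_q + 1/\lambda_q \bmod N$, assumed to be already reduced to a rational integer modulo $N$.

```python
def trace_pair(T1, m, N):
    """Return (T_m mod N, T_{m+1} mod N) for the sequence T_j = l^j + l^(-j).

    Uses T_{2j} = T_j^2 - 2 and T_{2j+1} = T_j T_{j+1} - T_1, reading the
    bits of m from the most significant one. Each bit costs one squaring
    and one multiplication modulo N.
    """
    lo, hi = 2 % N, T1 % N                 # (T_0, T_1)
    for bit in bin(m)[2:]:
        cross = (lo * hi - T1) % N         # T_{2j+1}
        if bit == "0":
            lo, hi = (lo * lo - 2) % N, cross          # (T_{2j}, T_{2j+1})
        else:
            lo, hi = cross, (hi * hi - 2) % N          # (T_{2j+1}, T_{2j+2})
    return lo, hi


if __name__ == "__main__":
    # With T_1 = 3 the sequence starts 2, 3, 7, 18, 47, 123, 322, ...
    print(trace_pair(3, 5, 10**9 + 7))
```

With $T_1 = 3$ the recurrence can be checked by hand: $T_2 = 3^2 - 2 = 7$, $T_3 = 3 \cdot 7 - 3 = 18$, $T_5 = 7 \cdot 18 - 3 = 123$ and $T_6 = 18^2 - 2 = 322$.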

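Finally, the desired value of $\lambda_q^m \bmod N$ can be obtained via the following:

- $(T_m, T_{m+1}) \equiv (2, T_1) \bmod N$, iff $\lambda^m \equiv \bar\lambda^m \equiv 1 \bmod N$,

- $(T_m, T_{m+1}) \equiv (-1, T) \bmod N$, where $2T \equiv -T_1 \pm 3d \bmod N$,
iff $\lambda^m \equiv \omega, \omega^2 \bmod N$.

This can be seen as follows. We have $(\lambda - \bar\lambda)(\lambda^m - \bar\lambda^m) = 2T_{m+1} - T_1T_m$
where, for brevity we write $\lambda$ for $\lambda_q$, and $\bar\lambda$ for $\bar\lambda_q$.

If $(T_m, T_{m+1}) \equiv (2, T_1) \bmod N$, then $(T_1^2 - 4)(\lambda^m - \bar\lambda^m)/(\lambda - \bar\lambda) \equiv 0 \bmod N$,
which implies $\lambda^m \equiv \bar\lambda^m \bmod N$, since we may assume $\gcd(N, T_1^2 - 4) = 1$. Since
$T_m \equiv 2 \bmod N$, this gives $\lambda^m \equiv \bar\lambda^m \equiv 1 \bmod N$, as required. The converse is
immediate.

If $(T_m, T_{m+1}) \equiv (-1, T) \bmod N$ then $\lambda^m + \bar\lambda^m \equiv -1 \bmod N$ and
$(\lambda - \bar\lambda)(\lambda^m - \bar\lambda^m) = 2T_{m+1} - T_1T_m \equiv 2T + T_1 \equiv \pm 3d \bmod N$. Now,
$-3d = (\omega - \omega^2)^2 d = (\omega - \omega^2)(\lambda - \bar\lambda) \bmod N$, and therefore
$\lambda^m - \bar\lambda^m \equiv \pm(\omega - \omega^2) \bmod N$. Hence,
$2\lambda^m \equiv -1 \pm (\omega - \omega^2) \bmod N$, as desired. The converse follows again immediately.

□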

Remark 2. In many cases the evaluation can be somewhat simplified via the
following. Suppose $N \equiv 1 \bmod 3$ and we attempt to compute the square root
of $-3 \bmod N$, under the assumption that $N$ is prime. If any such method fails
to return the correct answer, $N$ is disclosed as composite. Otherwise, we obtain
the value of $\omega \bmod N$ as $(-1 - \sqrt{-3})/2 \bmod N$. Now, exponentiation of any
$\lambda_q = c + d\omega \bmod N$ is simply the exponentiation of a rational integer modulo $N$.
There are several ways for computing the square root of $-3$ modulo a prime $N$.
The simplest require one, resp. two exponentiations, when $N \equiv 3 \bmod 4$, resp.
$N \equiv 5 \bmod 8$. In the other cases, a quadratic nonresidue modulo $N$ is required,
and in practice this is always easily found, see e.g. [3].
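For the simplest case of Remark 2, $N \equiv 1 \bmod 3$ and $N \equiv 3 \bmod 4$ (that is, $N \equiv 7 \bmod 12$), a hedged sketch looks as follows. The function name is ours, and the formula $r = (-3)^{(N+1)/4}$ is the standard square root for primes congruent to 3 modulo 4; a `None` result means that $N$ has been disclosed as composite.

```python
def omega_mod(N):
    """Try to compute a primitive cube root of unity modulo N.

    Assumes N % 12 == 7. Returns (-1 - r)/2 mod N with r^2 = -3 mod N,
    or None if the candidate r is not a square root of -3, in which case
    N cannot be prime.
    """
    if N % 12 != 7:
        raise ValueError("this sketch only covers N = 7 mod 12")
    r = pow(-3, (N + 1) // 4, N)           # one modular exponentiation
    if (r * r + 3) % N != 0:
        return None                        # N is composite
    return (-1 - r) * pow(2, -1, N) % N


if __name__ == "__main__":
    for N in (7, 19, 31, 43, 91):
        print(N, omega_mod(N))
```

For $N = 91 = 7 \cdot 13$ the candidate root fails modulo 13, so the function returns `None`.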

4.3 Growth-Rate Estimates 

The natural question that arises is whether there is any advantage to using 
pseudocubes instead of pseudosquares for testing primality. It is a difficult ques- 
tion to answer, mainly because the actual rate of growth of pseudosquares or 
pseudocubes is not known. In this section we will try to make such a compari- 
son by using heuristic estimates of the size of both the pseudosquares and the 
pseudocubes. 

Let $p_i$ denote the $i$-th prime, and $q_i$ denote the $i$-th prime $\equiv 1 \bmod 3$. Recall
that the test for the pseudosquares involves all primes $\le p$, where $p$ is the
smallest prime $p_n$ such that $N < L_{p_n}$. The cubic test only requires the primes
$q \equiv 1 \bmod 3$, $q \le p$, which are about half of all the primes. On the other hand,
the $p$ in this case is the smallest prime $q_n$ such that $N < M_{q_n}^{2/3}$. Since the
running time for each round requires about the same time for both of the tests,
we conclude that the test based on the pseudocubes will be ‘better’ than the
test based on the pseudosquares, only if

$M_{q_n}^{2/3} > L_{p_n}$. (4.3)

In [17] it is conjectured that the pseudosquares $L_{p_n}$ should have a growth
rate of the form

$L_{p_n} \approx c_1 2^n \log p_n$ (4.4)

which implies a growth rate for Lp of the form This estimate 

was derived from the conjecture that the solutions of 

$x \equiv 1 \bmod 8$, $\left(\frac{x}{p_i}\right) = 1$ ($i = 1, 2, \ldots, n$) (4.5)

are equidistributed in the region $0 < x < 8p_2p_3 \cdots p_n$.
Making the analogous heuristic assumption that the solutions of

$x \equiv \pm 1 \bmod 18$, $x^{(q_i-1)/3} \equiv 1 \bmod q_i$ ($i = 1, 2, \ldots, n$) (4.6)

are equidistributed in the region $0 < x < 18q_1 \cdots q_n$ we would expect that

$M_{q_n} \approx 9 \prod_{i=1}^{n} q_i / S$,

where $S$ is the number of solutions of (4.6). Since

$S = \prod_{i=1}^{n} (q_i - 1)/3$,

we get

$M_{q_n} \approx 9 \cdot 3^n \prod_{i=1}^{n} (1 - 1/q_i)^{-1}$.

Now, Mertens’ Theorem for arithmetic progressions states that

$\prod_{p_i \le x,\ p_i \equiv l \bmod k} (1 - 1/p_i) \approx c (\log x)^{-1/\varphi(k)}$,

where the constant $c$ and the error term can be made explicit, see [22]. From
this, we get

$M_{q_n} \approx c_2 3^n (\log q_n)^{1/2}$ (4.7)

for a constant $c_2$. To compare their relative growth rates, we get from (4.4) and
(4.7),

$\frac{M_{q_n}^{2/3}}{L_{p_n}} \approx \frac{c_2^{2/3}\, 3^{2n/3} (\log q_n)^{1/3}}{c_1 2^n \log p_n} > 1$

for $n$ sufficiently large. This suggests the following.

Conjecture. Under the same heuristic estimates (4.5) and (4.6) for the
pseudosquares and the pseudocubes we have $M_{q_n}^{2/3} > L_{p_n}$ for sufficiently large $n$.

Remark 3. In [18] character sum techniques have been used to make rigorous
some of the required statements about the distribution of numbers, as in (4.5).
Schinzel showed that, under the ERH, for every $\varepsilon > 0$ and for $p_n$ sufficiently
large, $(1 - \varepsilon)\sqrt{p_n} < \log L_{p_n} < (2 \log 2 + \varepsilon)p_n/\log p_n$. He also obtained a much
weaker result unconditionally. Unfortunately, in spite of using the powerful ERH,
his results are far from being as precise as the conjectured (4.4).
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4.4 Some Data 

A complete list of pseudocubes $M_{q_k}$ up to $q_k = 313$ can be found in [19]. Further
numerical computations were obtained in [16], which we give below, as they have
never been published.

    $q_k$    $M_{q_k}$
    331      75 017 625 272 879 381
    337      75 017 625 272 879 381
    349      75 017 625 272 879 381
    367      > 100 000 000 000 000 000

Thus, on comparing these data with the table in [17], we see that the above
heuristic that the pseudocubes will eventually outgrow the pseudosquares is
supported by the existing numerical evidence. However, we have yet to reach
the point where we get $M_{q_n} > L_{p_n}$. Further computations of $M_{q_k}$ may provide
more information concerning this.

Abstract. We present a non-archimedean method to construct, given
an integer $N \geq 1$, a finite field $\mathbf{F}_q$ and an elliptic curve $E/\mathbf{F}_q$ such that
$E(\mathbf{F}_q)$ has order $N$.



1 Introduction 

A classical theorem of Hasse from 1934 states that for an elliptic curve $E$ defined
over the finite field $\mathbf{F}_q$ of $q$ elements, the order of the group $E(\mathbf{F}_q)$ of $\mathbf{F}_q$-rational
points is an integer in the Hasse interval

$$ \mathcal{H}_q = [q + 1 - 2\sqrt{q},\ q + 1 + 2\sqrt{q}] $$

around $q$. If $E$ is given in some standard way, say by a Weierstrass equation
over $\mathbf{F}_q$, there are several algorithms that compute the order of $E(\mathbf{F}_q)$. The
1985 algorithm by Schoof [8,9] runs in time polynomial in $\log q$, and in small
characteristic $p$ there are even faster $p$-adic algorithms due to Satoh [7] and
Kedlaya [5].

The situation is rather different in the case of the following problem, which 
can be seen as an ‘inverse problem’ to the point counting problem. 

Problem. Given an integer $N \geq 1$, find a finite field $\mathbf{F}_q$ and an elliptic curve
$E/\mathbf{F}_q$ for which the number of $\mathbf{F}_q$-rational points equals $N$.

As with other inverse problems, such as in Galois theory, this is mathematically
a natural question to ask. In this particular case, an efficient solution to the
problem would also be desirable in view of the need in current applications
to construct elliptic curves having point groups satisfying various smoothness
requirements with respect to their order. It is one of the reasons why we focus
on the order $N$, and do not specify the finite field $\mathbf{F}_q$ as being part of the input.
In addition, we will use the freedom with respect to the choice of a base field $\mathbf{F}_q$
to our advantage.

A necessary condition for our problem to be solvable for given $N$ is clearly
that $N$ is contained in some Hasse interval $\mathcal{H}_q$, so we would like the union $\bigcup_q \mathcal{H}_q$
over all prime powers $q$ to contain all positive integers. It is easy to see that the
contribution to the union coming from the ‘true’ prime powers $q$ that are not
primes is negligible: it is contained in a zero density subset of $\mathbf{Z}_{\geq 1}$. For this
reason, we may and will restrict in the sequel to the case where the base field $\mathbf{F}_q$
is the prime field coming from a prime number $q$. In this particular case, all
integers in $\mathcal{H}_q$ actually do occur as the group order of $E(\mathbf{F}_q)$ for some elliptic
curve $E$, so $N \in \mathcal{H}_q$ is sufficient to guarantee the existence of a solution. (For
arbitrary prime powers $q$ there are often not enough supersingular curves to
realize all orders congruent to 1 modulo the characteristic.)

For the equality $\mathbf{Z}_{\geq 1} = \bigcup_{q\ \mathrm{prime}} \mathcal{H}_q$ we need to show that the primes are
not too far apart, i.e., that the gap between consecutive primes $q$ and $q'$ is
roughly bounded by $4\sqrt{q}$ for large $q$. This is more than what is currently known
to be true: even under assumption of the Riemann hypothesis the gap between
consecutive primes can only be shown to be of order $O(\sqrt{q}\,\log q)$. However,
from a practical, algorithmic point of view there are always lots of primes $q$
for which a large integer $N$ is contained in $\mathcal{H}_q$. Indeed, by the prime number
theorem, we expect 1 out of every $\log N$ integers around $N$ to be prime, so for
large $N$ the set of primes $q$ having $N \in \mathcal{H}_q$ is on average of size $4\sqrt{N}/\log N$,
and finding such $q$ is never a problem in practice.
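
The search for admissible primes can be sketched in a few lines. This is only an illustration under simple assumptions: the helper names are made up here, and the trial-division primality test is suitable for small inputs only. The membership test uses the equivalent integer condition $(q + 1 - N)^2 \leq 4q$, so no floating-point square roots are involved.

```python
from math import isqrt

def in_hasse_interval(q, n):
    """True if n lies in [q + 1 - 2*sqrt(q), q + 1 + 2*sqrt(q)], tested exactly."""
    return (q + 1 - n) ** 2 <= 4 * q

def is_prime(m):
    """Trial division; adequate only for small illustrative inputs."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def primes_with_n_in_hasse_interval(n):
    """All primes q whose Hasse interval contains n."""
    r = isqrt(n)
    lo = max(2, n + 1 - 2 * r - 2)   # slightly enlarged window; exact test below
    hi = n + 1 + 2 * r + 2
    return [q for q in range(lo, hi + 1)
            if in_hasse_interval(q, n) and is_prime(q)]

print(primes_with_n_in_hasse_interval(100))
```

For a large $N$ one would replace `is_prime` by a probabilistic test and sample the window instead of scanning it.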

Once we have found a prime $q > 3$ for which we have $N \in \mathcal{H}_q$ (we now
require $N > 1$), there is the following naive algorithm to find an elliptic curve
having exactly $N$ rational points over $\mathbf{F}_q$. Suppose that we are not in the easy
cases where we have $N = q + 1$ (then any supersingular curve over $\mathbf{F}_q$ will do)
or where one of the few curves with $j$-invariant 0 or 1728 has the right number
of points. Then we try

$$ E_a : y^2 = x^3 + ax - a \quad\text{with}\quad j(E_a) = \frac{6912a}{4a + 27} $$

for random $a \in \mathbf{F}_q^* \setminus \{-27/4\}$ as the Weierstrass equation of the desired curve until
we find a curve having $N$ points. More precisely, we write $N = q + 1 - t$ and
check whether for our $a$ the point $(1,1) \in E_a(\mathbf{F}_q)$ is annihilated by $N = q + 1 - t$
or $q + 1 + t$. If it is, we check whether the number of $\mathbf{F}_q$-rational points is indeed
$q + 1 \pm t$. For order $N = q + 1 - t$ we are done, for order $q + 1 + t$ not $E_a$ itself but its
quadratic twist has $N$ points. Even though the distribution of the group orders
of elliptic curves over $\mathbf{F}_q$ is not quite uniform, we expect to examine $O(\sqrt{q}) = O(\sqrt{N})$
curves $E_a$ before we hit a curve having exactly $N$ points. As the amount
of time spent per $a$ is usually very small, and certainly polynomial in $\log N$, this
yields a probabilistic algorithm with expected running time $O(N^{1/2+o(1)})$. It is
quite practical for small values of $N$, but becomes unwieldy for $N$ ^ 10^®.
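
A toy version of the naive algorithm may help fix ideas. The sketch below makes two simplifications that are not in the text: it scans $a = 1, 2, \ldots$ in order instead of drawing random values, and it counts points by a full Legendre-symbol sum instead of first testing the point $(1,1)$. It is therefore usable only for very small $q$; all function names are illustrative.

```python
def legendre(v, q):
    """Legendre symbol (v/q) for an odd prime q, as -1, 0 or 1."""
    v %= q
    if v == 0:
        return 0
    return 1 if pow(v, (q - 1) // 2, q) == 1 else -1

def count_points(A, B, q):
    """Number of points on y^2 = x^3 + A*x + B over F_q, including infinity."""
    return q + 1 + sum(legendre(x * x * x + A * x + B, q) for x in range(q))

def naive_curve_of_order(N, q):
    """Return (A, B) with #E(F_q) = N for E: y^2 = x^3 + A*x + B, or None."""
    t = q + 1 - N
    d = next(v for v in range(2, q) if legendre(v, q) == -1)  # a non-square
    for a in range(1, q):
        if (4 * a + 27) % q == 0:        # singular member of the family E_a
            continue
        n = count_points(a, -a, q)
        if n == N:
            return a, (-a) % q
        if n == q + 1 + t:               # the quadratic twist by d has N points
            return a * d * d % q, (-a) * d ** 3 % q
    return None

A, B = naive_curve_of_order(95, 101)
print(A, B, count_points(A, B, 101))
```

The twist of $y^2 = x^3 + ax - a$ by a non-square $d$ is written as $y^2 = x^3 + ad^2x - ad^3$, which swaps the sign of the trace.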

In the next section we briefly describe a classical deterministic algorithm 
based on complex multiplication methods which, although not asymptotically 
faster than the naive algorithm, can be improved in various ways. Our first 
improvement is a p-adic approach to complex multiplication based on the re- 
cent work of Couveignes and Henocq [3]. It is described in section 3, and illus- 
trated by the explicit computation in section 4 of an ‘ANTS 6 curve’ having 
2004061320040618 rational points. Our second improvement, in section 5, con- 
sists of using ‘small’ modular functions in this p-adic context to push the limits 
of what is feasible by p-adic methods. Although the resulting algorithm is still 
far from polynomial, its power is illustrated in section 6 by the computation of 
an elliptic curve having lO^'^ rational points.

2 Complex Multiplication

A deterministic algorithm to produce an elliptic curve $E$ over the prime field $\mathbf{F}_q$
having $N \in \mathcal{H}_q$ points is provided by the theory of complex multiplication. One
writes $N = q + 1 - t$ and observes that the Frobenius endomorphism $F_q : E \to E$
on the desired curve $E/\mathbf{F}_q$ satisfies the quadratic relation $F_q^2 - tF_q + q = 0$ of
discriminant $\Delta = t^2 - 4q < 0$ in $\operatorname{End}(E) = \operatorname{End}_{\overline{\mathbf{F}}_q}(E)$. Assume we are not in the
supersingular case $t = 0$. Then our observation gives rise to an embedding

$$ \operatorname{End}(E) \to K = \mathbf{Q}(\sqrt{\Delta}), \qquad F_q \mapsto \pi_q = (t + \sqrt{\Delta})/2 $$

that maps $F_q$ to a prime element $\pi_q$ of trace $t$ and norm $q$ in the quadratic order
$\mathcal{O}_\Delta = \mathbf{Z}[(\Delta + \sqrt{\Delta})/2] \subset K$ of discriminant $\Delta$. By the Deuring lifting theorem
[6, Chapter 13, Section 5], there exist a number field $H \supset K$ and an elliptic
curve $\tilde{E}/H$ such that

1. there exists $\phi_q \in \operatorname{End}(\tilde{E})$ satisfying $\phi_q^2 - t\phi_q + q = 0$;

2. the prime $q$ splits completely in $H/\mathbf{Q}$, and for every prime $\mathfrak{q} \mid q$ in $H$ the
reduced curve $\tilde{E} \bmod \mathfrak{q}$ is an elliptic curve having $q + 1 - t$ points.

In fact, the reduction of the endomorphism $\phi_q \in \operatorname{End}(\tilde{E})$ above modulo a prime
$\mathfrak{q} \mid q$ in $H$ yields the Frobenius endomorphism of the curve $\tilde{E} \bmod \mathfrak{q}$ over $\mathbf{F}_q$.

The smallest field $H \supset K$ over which a curve $\tilde{E}$ satisfying 1 and 2 can
be defined is the Hilbert class field of $K$. More explicitly, from a list of reduced
binary quadratic forms $[a, b, c]$ of discriminant $D = \operatorname{disc}(K)$, which can be viewed
as a list of elements of the class group $Cl(D)$ of discriminant $D$, we can form the
class polynomial

$$ F_D = \prod_{[a,b,c] \in Cl(D)} \left( X - j\!\left( \frac{-b + \sqrt{D}}{2a} \right) \right) \in \mathbf{Z}[X] $$

of discriminant $D$. Here $j : \mathbf{H} \to \mathbf{C}$ denotes the well-known elliptic modular
function on the complex upper half plane. Using sufficiently accurate complex
approximations of the zeroes $j\left(\frac{-b + \sqrt{D}}{2a}\right)$ of $F_D$, one may exactly determine $F_D$
as it has integral coefficients. Once we have $F_D$, we are essentially done. Indeed,
any zero of the irreducible polynomial $F_D \in \mathbf{Z}[X]$ generates the Hilbert class
field $H$ of $K$ over $K$, and modulo $q$ the polynomial $F_D \in \mathbf{F}_q[X]$ splits into linear
factors. (In fact, this property of $F_D$ is an excellent check for the correctness of
any algorithm to compute $F_D$.) The zeroes of $F_D$ in $\mathbf{F}_q$ are the $j$-invariants of
the elliptic curves over $\mathbf{F}_q$ having endomorphism ring isomorphic to the ring of
integers $\mathcal{O}_D$ of $K$. If $j \in \mathbf{F}_q$ is one of these zeroes, we write down a curve with
this $j$-invariant, and check whether it has $q + 1 - t$ points. If it hasn’t, we have
found a curve with $q + 1 + t$ points, and (for $j \neq 0, 1728$) its quadratic twist has
$N = q + 1 - t$ points.
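
The list of reduced forms that indexes the factors of the class polynomial is easy to enumerate. The sketch below uses the standard reduction conditions $-a < b \leq a \leq c$, with $b \geq 0$ when $a = c$; the function name is an assumption of this example, and only the enumeration is shown, not the evaluation of $j$.

```python
from math import gcd

def reduced_forms(D):
    """Primitive reduced forms [a, b, c] with b^2 - 4ac = D, for a discriminant D < 0."""
    assert D < 0 and D % 4 in (0, 1)
    forms = []
    a = 1
    while 3 * a * a <= -D:               # reduced forms satisfy 3a^2 <= |D|
        for b in range(-a + 1, a + 1):   # -a < b <= a
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a:
                continue
            if b < 0 and a == c:         # keep only b >= 0 when a = c
                continue
            if gcd(gcd(a, abs(b)), c) != 1:
                continue
            forms.append((a, b, c))
        a += 1
    return forms

print(reduced_forms(-23))
# The text reports class number 132 for D = -541608 (section 4).
print(len(reduced_forms(-541608)))
```

The number of forms returned is the degree $h(D)$ of the class polynomial.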

This deterministic algorithm, although relatively simple, is not much faster
than the naive algorithm, as for large $D$ the class polynomial $F_D$ is of degree
$h(D)$ = and has coefficients of size making the total

running time of order However, if we have the freedom to pick q on 

input $N$, it is often possible to find primes $q$ for which the associated discriminant
$\Delta = \Delta(q) = (q - 1 - N)^2 - 4N$ is not only of size $N$ (this is simply done by
picking $q \in [N + 1 - 2\sqrt{N}, N + 1 + 2\sqrt{N}]$ close to the end points of the interval),
but moreover has a large square factor. This leads to a field discriminant $D$ of $K$
which is quite a bit smaller than $\Delta$, and makes the method feasible in cases
where the naive method would take too long. As we currently cannot even prove
the existence of a single prime $q \in [N + 1 - 2\sqrt{N}, N + 1 + 2\sqrt{N}]$, we certainly
cannot prove the existence of $q$ for which $\Delta(q)$ has large square factors and $D$
is of order substantially smaller than $O(N^{1/2+o(1)})$.

3 A Non-Archimedean Approach

The key feature of the complex multiplication method in the previous section
is the computation of the class polynomial $F_D \in \mathbf{Z}[X]$ of the order $\mathcal{O}_D$ for
suitable $D$. As the zeroes modulo $q$ of $F_D$ are $j$-invariants of curves $E/\mathbf{F}_q$ for
which either $E$ or its quadratic twist has exactly $N$ points, this immediately
solves our problem. In this section we achieve the computation of $F_D$ in another
way, using $p$-adic instead of complex approximations of the zeroes of $F_D$.
Working in a non-archimedean setting has the advantage that we no longer have
to cope with the problem of rounding errors that arises in the complex case. It
does require a $p$-adic substitute for the complex analytic method to evaluate $j$
in CM-points of $\mathbf{H}$ using Fourier expansions, and this is provided by the recent
work of Couveignes and Henocq [3] explained in this section.

Let $N = q + 1 - t$ and $\Delta = t^2 - 4q$ be as in the previous section, and $D < -4$
the discriminant of $\mathbf{Q}(\sqrt{\Delta})$. We first construct an elliptic curve $E$ over a finite
field $\mathbf{F}_p$ which has CM with $\mathcal{O}_D$. As we want $p$ to be as small as possible, we let $s$
be the smallest positive integer of the same parity as $D$ for which $p = (s^2 - D)/4$
is prime. For $D \equiv 1 \bmod 8$ such $p$ does not exist for parity reasons unless $D = -7$,
and we pick the smallest positive $s$ for which $p = (s^2 - 4D)/4$ is prime instead.
In practice we expect $s$ to be small, at most a power of $\log |D|$, so that $p$ is of the
same order of magnitude as $D$. Unfortunately, even under GRH proven upper
bounds [3] for $s$ are much weaker.

As there is a prime element $\pi_p = (s + \sqrt{D})/2$ (or $\pi_p = (s + 2\sqrt{D})/2$) of
norm $p$ and trace $s > 0$ in the order $\mathcal{O}_D = \mathbf{Z}[\pi_p]$ (or $\mathcal{O}_{4D} = \mathbf{Z}[\pi_p]$), there exists
an ordinary elliptic curve over $\mathbf{F}_p$ having CM by $\mathbf{Z}[\pi_p]$ and $p + 1 \pm s$ points
over $\mathbf{F}_p$. We can find such a curve $E/\mathbf{F}_p$ by applying the naive algorithm, as we
saw in section 2 that $|D|$ is much smaller than $N$. We have $\operatorname{End}(E) = \mathcal{O}_D$ for
$D \not\equiv 1 \bmod 8$. For $D \equiv 1 \bmod 8$ the ring $\operatorname{End}(E)$ is either equal to $\mathbf{Z}[\pi_p] = \mathcal{O}_{4D}$
or to $\mathcal{O}_D \supset \mathbf{Z}[\pi_p]$. We are in the second case if all 2-torsion of $E$ is $\mathbf{F}_p$-rational,
and in the first if it isn’t. As for $D \equiv 1 \bmod 8$ the class polynomials $F_D$ and $F_{4D}$
have the same degree and both generate the Hilbert class field $H$ of $K$, they are
both fine for our purposes. Alternatively, in case we have $\operatorname{End}(E) = \mathcal{O}_{4D}$ there
is a unique point $P$ of order 2 in $E(\mathbf{F}_p)$, and we can replace $E$ by the 2-isogenous
curve $E/\langle P \rangle$ to achieve $\operatorname{End}(E) = \mathcal{O}_D$. We assume that we have $\operatorname{End}(E) = \mathcal{O}_D$
in the sequel. Note that $D$ is by definition fundamental.

By the Deuring lifting theorem, there exists a curve $\tilde{E}/H$ and a prime $\mathfrak{p} \mid p$
such that $\tilde{E}$ reduces modulo $\mathfrak{p}$ to $E$ and such that we have $\operatorname{End}(\tilde{E}) = \operatorname{End}(E)$.
As $p$ splits completely in $H/\mathbf{Q}$ the curve $\tilde{E}$ is actually defined over $\mathbf{Q}_p$. In fact,
$\tilde{E}$ is the unique elliptic curve over $\mathbf{Q}_p$ with reduction $E$ and endomorphism
ring $\operatorname{End}(\tilde{E}) = \operatorname{End}(E) = \mathcal{O}_D$. It is this canonical lift $\tilde{E}$ of $E$ that we want to
compute, as its $j$-invariant is a zero of our class polynomial.

Let $\mathrm{Ell}_D(\mathbf{F}_p)$ be the set of $\mathbf{F}_p$-isomorphism classes of elliptic curves over $\mathbf{F}_p$
with endomorphism ring $\mathcal{O}_D$. The $j$-invariants of the elements in $\mathrm{Ell}_D(\mathbf{F}_p)$ are
the zeroes of $(F_D \bmod p)$, so $\mathrm{Ell}_D(\mathbf{F}_p)$ is finite of order $h(D) = \#Cl(D)$. It can
be identified with the similarly defined set $\mathrm{Ell}_D(\mathbf{Q}_p)$ of $\mathbf{Q}_p$-isomorphism classes
of elliptic curves over $\mathbf{Q}_p$ with endomorphism ring $\mathcal{O}_D$. The $j$-invariants of the
elements in $\mathrm{Ell}_D(\mathbf{Q}_p)$ are the zeroes of $F_D$. Let $\mathbf{C}_p$ be the completion of an
algebraic closure of $\mathbf{Q}_p$, and write $X_D(\mathbf{C}_p)$ for the set of isomorphism classes of
elliptic curves over $\mathbf{C}_p$ with the property that their reduction is an element of
$\mathrm{Ell}_D(\mathbf{F}_p)$. Then $X_D(\mathbf{C}_p)$ is a $p$-adic analytic space, as the $j$-invariant identifies
it with a subset of $\mathbf{C}_p$. It consists of $h(D)$ discs of $p$-adic radius 1. Each disc
contains exactly one element of $\mathrm{Ell}_D(\mathbf{Q}_p)$, and this is the subset of $\mathbf{C}_p$ we want
to compute. It consists of the ($j$-invariants of the) isomorphism classes of elliptic
curves in $X_D(\mathbf{C}_p)$ having CM with $\mathcal{O}_D$.

Let $I \subset \mathcal{O}_D$ be an $\mathcal{O}_D$-ideal prime to $p$ and $E/\mathbf{F}_p$ an elliptic curve in
$\mathrm{Ell}_D(\mathbf{F}_p)$. Then there is a separable isogeny $E \to E_I$ which has the subgroup
$E[I]$ of $I$-torsion points of $E$ as its kernel. In this way, we obtain a bijection
$\rho_I : \mathrm{Ell}_D(\mathbf{F}_p) \to \mathrm{Ell}_D(\mathbf{F}_p)$ that sends the isomorphism class of $E$ to that of $E_I$.
We obtain an action of the group $\mathcal{I}(p)$ of $\mathcal{O}_D$-ideals prime to $p$ on $\mathrm{Ell}_D(\mathbf{F}_p)$, and
since principal $\mathcal{O}_D$-ideals act trivially, this action factors via the quotient map
$\mathcal{I}(p) \to Cl(D)$. This makes $\mathrm{Ell}_D(\mathbf{F}_p)$ into a principal homogeneous $Cl(D)$-space.

The fundamental idea in [3] is that the action of $\mathcal{I}(p)$ on $\mathrm{Ell}_D(\mathbf{F}_p)$ admits
a natural lift to an action of $\mathcal{I}(p)$ on $X_D(\mathbf{C}_p)$. More precisely, for $I \in \mathcal{I}(p)$ the
map $\rho_I : X_D(\mathbf{C}_p) \to X_D(\mathbf{C}_p)$ is a $p$-adic analytic map that lifts $\rho_I$ in the sense
that on $\mathrm{Ell}_D(\mathbf{Q}_p) \subset X_D(\mathbf{C}_p)$, the restriction of $\rho_I$ is the standard Galois action
that factors via $Cl(D)$. If $I = (\alpha)$ is principal with $\alpha \in \mathcal{O}_D \setminus \mathbf{Z}$, then $\rho_\alpha = \rho_I$
stabilizes each disk around a CM-point in $\mathrm{Ell}_D(\mathbf{Q}_p)$, and has the CM-point in
the disk as its unique fixed point.

It is shown in [3] that the derivative of the map $\rho_\alpha$ in a point of $\mathrm{Ell}_D(\mathbf{Q}_p)$
equals $\alpha/\bar{\alpha}$, and this can be used to compute the $j$-invariant of a CM-point in
$\mathrm{Ell}_D(\mathbf{Q}_p)$ starting from an arbitrary point in the disk using a Newton iteration
process. If $E_1$ is any lift of $E$ to $\mathbf{C}_p$, we put

$$ j(E_{k+1}) = j(E_k) - \frac{\rho_\alpha(j(E_k)) - j(E_k)}{(\alpha/\bar{\alpha}) - 1}, \qquad k \in \mathbf{Z}_{\geq 1}. \tag{1} $$

If $\alpha$ is small with respect to $p$, then $(\alpha/\bar{\alpha}) - 1$ is a unit in $\mathbf{Z}_p$ and the sequence
(1) converges to the $j$-invariant of the canonical lift $\tilde{E}$. In each step, the $p$-adic
precision of the approximation is doubled.

For the definition of $\rho_I$ on the isomorphism class of $E'$ in $X_D(\mathbf{C}_p)$, we note
that the subgroup $E[I]$ of $I$-torsion points of the reduction $E/\mathbf{F}_p$ of $E'$ lifts
canonically to a subgroup $E'[I]$ of $E'$, and we put $\rho_I(E') = E'_I$, with $E' \to E'_I$
the isogeny with kernel $E'[I]$. This provides a lift of $E[I]$ to a group scheme over
the $p$-adic disk in $X_D(\mathbf{C}_p)$ lying over $[E] \in \mathrm{Ell}_D(\mathbf{F}_p)$. More algorithmically, if
$E$ is given by a Weierstrass model, the subgroup $E[I]$ can be described by a
separable polynomial $f_I \in \mathbf{F}_p[X]$ having the $x$-coordinates of the affine points
in $E[I]$ as its zeroes. If $I$ has norm $n$, then $p \nmid n$ and $f_I$ divides the $n$-th division
polynomial of $E$ in $\mathbf{F}_p[X]$. Choosing a Weierstrass model for $E'$ reducing to that
for $E$, we can lift $f_I$ uniquely by Hensel’s lemma to a factor $\tilde{f}_I$ of the $n$-th division
polynomial of $E'$, and this factor describes the subgroup $E'[I]$. Note that $\rho_I$ is
the identity on $X_D(\mathbf{C}_p)$ if $I$ is generated by an integer, and that, consequently,
$\rho_{\bar{I}}$ is the inverse of $\rho_I$.

For the explicit computation of $\rho_\alpha(E')$, we take an element $\alpha = a + b\pi_p$ in
$\mathcal{O}_D \setminus \mathbf{Z}$ that is sufficiently smooth, i.e., a product of $\mathcal{O}_D$-ideals $L = (\ell, a + b\pi_p)$
of small prime norm $\ell \neq p$. Such an element is found by sieving in the set

$$ \{a + b\pi_p : a, b \in \mathbf{Z},\ b \neq 0,\ (a, b) = 1\}. $$

Again, the smoothness properties are in practice much better than what can be
rigorously proved [3]. For the computation of $\rho_L(E')$, we first compute the action
of $\rho_L$ on the reduction $E = \overline{E'} \in \mathrm{Ell}_D(\mathbf{F}_p)$, i.e., the $j$-invariant of $E_L = \overline{E'_L}$.
The kernel $E[L]$ of the isogeny $E \to E_L$ is a cyclic subgroup of order $\ell$ of $E[\ell]$
that is an eigenspace of the Frobenius morphism with eigenvalue $-a/b \in \mathbf{F}_\ell$
(as $a + b\pi_p \in L$, Frobenius acts on $E[L]$ as $-a/b$).
The corresponding polynomial $f_L \in \mathbf{F}_p[X]$ can be computed by the techniques
used by Atkin and Elkies to improve Schoof’s original point counting algorithm,
see [9]. These techniques also yield a Weierstrass model for $E_L$; we only need
the $j$-invariant $j(E_L) \in \mathbf{F}_p$ of $E_L = \rho_L(E)$. From the decomposition of $\rho_\alpha$ into
‘prime degree’ maps $\rho_L$, we obtain a cycle of isogenies

$$ E \to E_{L_1} \to E_{L_1 L_2} \to \cdots \to E_{(\alpha)} = E. \tag{2} $$

To compute the action of $L$ on the lift $E'$ of $E$, we do not lift $f_L$ to some precision
to a divisor of the $\ell$-th division polynomial of $E'$, and then find a Weierstrass
model and the $j$-invariant for $E'_L$ using Vélu’s formulas [11]. As the division
polynomial has degree $(\ell^2 - 1)/2$ for $\ell > 2$, this would be rather time-consuming.
Instead, we exploit the $\ell$-th modular polynomial $\Phi_\ell(X, Y) \in \mathbf{Z}[X, Y]$, which is
of degree $\ell + 1$ in each of the variables, and which we can precompute for a
number of small primes $\ell$. As there is an isogeny $E' \to E'_L$ of degree $\ell$, we know
that $j(E'_L)$ is a zero of $\Phi_\ell(j(E'), Y) \in \mathbf{Z}_p[Y]$. Since we know the $j$-invariant of
the reduction of $E'_L$, we also know which root to approximate in $\mathbf{Z}_p$, and this
reduces the lifting process to a simple Hensel lift of a zero of a polynomial of
degree $\ell + 1$.
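
The Hensel lift mentioned here is the ordinary Newton iteration carried out modulo increasing powers of $p$. The following sketch lifts a simple root of an arbitrary integer polynomial; it is a generic illustration, not the modular-polynomial computation itself, and the example polynomial $X^2 - 2$ over $\mathbf{Z}_7$ is chosen only because it is small.

```python
def poly_eval(coeffs, x, m):
    """Evaluate a polynomial (coefficients from highest degree down) at x modulo m."""
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % m
    return acc

def derivative(coeffs):
    """Coefficients of the derivative, highest degree first."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def hensel_lift(coeffs, root, p, digits):
    """Lift a simple root modulo p to a root modulo p**digits, doubling precision per step."""
    dcoeffs = derivative(coeffs)
    assert poly_eval(coeffs, root, p) == 0, "not a root modulo p"
    assert poly_eval(dcoeffs, root, p) != 0, "root is not simple modulo p"
    k = 1
    while k < digits:
        k = min(2 * k, digits)
        m = p ** k
        inv = pow(poly_eval(dcoeffs, root, m), -1, m)   # f'(root) is a unit mod p
        root = (root - poly_eval(coeffs, root, m) * inv) % m
    return root

r = hensel_lift([1, 0, -2], 3, 7, 8)   # a square root of 2 in Z_7, to 8 digits
print(r, (r * r - 2) % 7 ** 8)
```

In the setting of the text, `coeffs` would be the coefficients of $\Phi_\ell(j(E'), Y)$ at the current precision and `root` the known $j$-invariant of the reduction.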

For our Newton process in (1), we start from $E \in \mathrm{Ell}_D(\mathbf{F}_p)$, and compute
the cycle of $\mathbf{F}_p$-isogenies in (2). We then lift $E$ arbitrarily to a curve $E_1$ over $\mathbf{Q}_p$,
the ‘1 digit precision’ approximation of the canonical lift. Now we compute lifts
over $\mathbf{Q}_p$ of our $\mathbf{F}_p$-isogenies in 2 digit precision, using the modular polynomials,
and use the value of $\rho_\alpha(E_1)$ obtained to update $E_1$ as in (1) to a 2 digit precision
approximation $E_2$ of the canonical lift. We continue this process of making
(not really closed) cycles over $\mathbf{Q}_p$, doubling the precision of the computation at
each step, until we have the canonical lift with high enough accuracy (see [2,
Chapter 7, Section 6] for an estimate of the required accuracy).
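
The doubling of precision can be read off from a short Taylor argument. The sketch below is a heuristic derivation, assuming only what is stated above: $\rho_\alpha$ is $p$-adically analytic on the disk, fixes the CM-point $x^*$, and has derivative $\alpha/\bar{\alpha}$ there. The symbols $g$, $x_k = j(E_k)$ and $e_k$ are notation introduced for this illustration.

```latex
\begin{aligned}
g(x) &= \rho_\alpha(x) - x, \qquad g(x^*) = 0, \qquad
g'(x^*) = \frac{\alpha}{\bar{\alpha}} - 1, \\
x_{k+1} &= x_k - \frac{g(x_k)}{g'(x^*)}
         = x_k - \frac{\rho_\alpha(x_k) - x_k}{(\alpha/\bar{\alpha}) - 1}, \\
e_k &= x_k - x^*: \qquad
g(x_k) = g'(x^*)\, e_k + O(e_k^2)
\;\Longrightarrow\;
e_{k+1} = e_k - \frac{g'(x^*)\, e_k + O(e_k^2)}{g'(x^*)} = O(e_k^2).
\end{aligned}
```

Since $g'(x^*)$ is a unit in $\mathbf{Z}_p$ when $\alpha$ is small with respect to $p$, the division loses no precision, so an error divisible by $p^m$ becomes an error divisible by $p^{2m}$.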

If we know the canonical lift $\tilde{E}$, we compute its Galois conjugates again via
the modular polynomials. For this, we need small primes $L$ that generate the
class group. Under GRH, this can be done using primes $L$ not exceeding the
Bach bound $6 \log^2(|D|)$, see [2, Chapter 5, Section 5]. In practice, this is never
a problem. If we have all the conjugates to the required precision, we find by
simple expansion of the product below the class polynomial

$$ F_D = \prod_{[I] \in Cl(D)} (X - j(\tilde{E}_I)) \in \mathbf{Z}[X]. $$

4 An Elliptic Curve for ANTS 6 

We illustrate the working of our $p$-adic method by computing a tailor made
elliptic curve for ANTS 6 having exactly $N = 2004061320040618$ points.

First we look for a small discriminant, so we write $N = q + 1 - t$ for various
primes $q$ and search for a large square dividing $\Delta = t^2 - 4q$. In this example, the
choice

$$ N = 2004061230508291 + 1 + 89532326 $$

yields $\Delta = -2^3 \cdot 3 \cdot 619^2 \cdot 22567$, so we take $D = -2^3 \cdot 3 \cdot 22567 = -541608$ in
this section. The corresponding class group $Cl(D)$ has order 132.
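
The arithmetic of this choice is easy to reproduce. The sketch below recomputes $t$ and $\Delta$ from the values of $N$ and $q$ given in the text and strips the square part of $\Delta$ by trial division; the helper `fundamental_part` is a name made up for this example and is practical only because the square factor here is small.

```python
def fundamental_part(delta):
    """Write a negative discriminant as delta = f**2 * D with D fundamental; return (D, f)."""
    assert delta < 0 and delta % 4 in (0, 1)
    D, f = delta, 1
    g = 2
    while g * g <= -D:
        # remove g**2 only while the quotient is still a discriminant (0 or 1 mod 4)
        while D % (g * g) == 0 and (D // (g * g)) % 4 in (0, 1):
            D //= g * g
            f *= g
        g += 1
    return D, f

N = 2004061320040618
q = 2004061230508291
t = q + 1 - N                 # trace of Frobenius
delta = t * t - 4 * q         # discriminant of the Frobenius relation
print(t, delta, fundamental_part(delta))
```

The factor 4 in $\Delta$ is not removed, since $\Delta/4 \equiv 2 \bmod 4$ is not a discriminant.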

Our goal is to compute the class polynomial $F_D$. We will do this $p$-adically, so
we first find an elliptic curve over some $\mathbf{F}_p$ which has CM with $\mathcal{O}_D$. The smallest
integer $s > 0$ for which $(s^2 - D)/4$ is prime is $s = 2$, so we have $D = 2^2 - 4p$
with $p = 135403$. We fix this value of $p$ for the rest of this section.
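
The rule from section 3 for choosing $s$ and $p$ can be written out directly. In this sketch the function name is hypothetical and primality is tested by trial division, which is adequate because $p$ is of the same order of magnitude as $|D|$.

```python
def is_prime(m):
    """Trial division; fine for the moderate sizes that occur here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def smallest_cm_prime(D):
    """Return (s, p) with s > 0 minimal such that p is prime.

    p = (s^2 - D)/4 with s of the same parity as D, except for
    D = 1 mod 8 and D != -7, where p = (s^2 - 4D)/4 is used instead.
    """
    use_4d = (D % 8 == 1 and D != -7)
    s = 1
    while True:
        num = s * s - (4 * D if use_4d else D)
        if num % 4 == 0 and is_prime(num // 4):   # num % 4 == 0 encodes the parity rule
            return s, num // 4
        s += 1

print(smallest_cm_prime(-541608))
```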

We now apply the naive method to find a curve $E_a : y^2 = x^3 + ax - a$ over
$\mathbf{F}_p$ with trace of Frobenius 2. We find that $E_{1737}$ has trace 2 and take this as
our base curve $E/\mathbf{F}_p$.

Next we determine which element $\alpha \in \mathcal{O}_D \setminus \mathbf{Z}$ we will use for the map $\rho_\alpha$.
The ideal $(\alpha) = (22539 + 4\pi_p)$ of norm $510353281 = 19^2 \cdot 29^2 \cdot 41^2$ factors as

$$ (\alpha) = L_{19}^2 \cdot L_{29}^2 \cdot L_{41}^2 = (19, \pi_p + 6)^2 \cdot (29, \pi_p - 13)^2 \cdot (41, \pi_p - 13)^2. $$

We compute the action for the prime ideal $L_{19}$. The eigenvalue of the action of
Frobenius on the 19-torsion is $-6 \in \mathbf{F}_{19}$. If we evaluate the modular polynomial
$\Phi_{19}(X, Y)$ in $X = j(E) = 41556 \in \mathbf{F}_p$, we get a polynomial which has two roots
over $\mathbf{F}_p$, namely 19533 and 54827. From this we deduce that $L_{19}$ sends $j(E)$ to
one of these two roots; we don’t know which one yet.
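
Several of the numbers in this setup can be checked without any modular polynomials. The sketch below recomputes $j(E_{1737})$ from the formula $j(E_a) = 6912a/(4a+27)$, counts points on $E_{1737}$ to get its trace (the text reports 2), and verifies the norm of $\alpha$ and the Frobenius eigenvalues $-a/b \bmod \ell$ for $\alpha = a + b\pi_p$. The variable names are chosen for this illustration only.

```python
p = 135403          # prime from the text
s = 2               # trace of pi_p, so pi_p^2 - s*pi_p + p = 0
a_curve = 1737      # base curve E_a : y^2 = x^3 + a*x - a

def legendre(v, m):
    """Legendre symbol (v/m) for an odd prime m, as -1, 0 or 1."""
    v %= m
    if v == 0:
        return 0
    return 1 if pow(v, (m - 1) // 2, m) == 1 else -1

# j-invariant of E_a, using j(E_a) = 6912a / (4a + 27)
j = 6912 * a_curve * pow(4 * a_curve + 27, -1, p) % p

# trace of Frobenius by direct point counting: #E = p + 1 - trace
trace = -sum(legendre(x ** 3 + a_curve * x - a_curve, p) for x in range(p))

def norm(a, b):
    """Norm of a + b*pi_p, where pi_p has trace s and norm p."""
    return a * a + a * b * s + b * b * p

a, b = 22539, 4
n = norm(a, b)
eigenvalues = {ell: (-a * pow(b, -1, ell)) % ell for ell in (19, 29, 41)}
print(j, trace, n, eigenvalues)
```

The eigenvalue modulo 19 comes out as $13 \equiv -6$, in agreement with $L_{19} = (19, \pi_p + 6)$.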

We just guess that the correct $j$-invariant is $54827 \in \mathbf{F}_p$. Following Elkies
[9], we now compute the eigenspace $S$ of the 19-torsion corresponding to this
isogeny. We get the $x$-coordinates of the points on $E$ in $S$ as zeroes of

$$ X^9 + 29873X^8 + 49874X^7 + 131130X^6 + 49222X^5 + 46538X^4 + 111513X^3 + 68602X^2 + 126444X + 20947 \in \mathbf{F}_p[X]. $$

Since we know that the eigenvalue for $L_{19}$ is $-6$, we can now just check whether

$$ (X^p, Y^p) = -6 \cdot (X, Y) $$

holds for points in $S$, i.e., we compute both $(X^p, Y^p)$ and $-6 \cdot (X, Y)$ in the ring

$$ \mathbf{F}_p[X, Y]/(f_{L_{19}}(X),\ Y^2 - X^3 - 1737X + 1737). $$

Note that the $\cdot$ means adding on the curve! In this example, it turns out that
$(X^p, Y^p)$ and $-6 \cdot (X, Y)$ are not the same. It follows that the correct $j$-invariant
of the $L_{19}$-isogenous curve is the other value $19533 \in \mathbf{F}_p$.

The action of $L_{19}$ on the curve with $j$-invariant 19533 is now easier to compute:
the modular polynomial has again two roots, but one of the roots has to
be $j(E)$. This root corresponds to the action of $\bar{L}_{19}$, so we pick the other root.
If we compute the entire cycle corresponding to $\alpha$, we get:

$$ 41556 \to 19533 \to 100121 \to 86491 \to 40349 \to 32517 \to 41556. $$

We now lift $E/\mathbf{F}_p$ to $E_1/\mathbf{Q}_p$ by lifting the coefficients of the Weierstrass equation
arbitrarily. The polynomial $\Phi_{19}(j(E_1), Y) \in \mathbf{Z}_p[Y]$ has exactly two roots, one of
which reduces to 19533 modulo $p$. We compute this root, which is the value of
the $L_{19}$-action on $j(E_1)$, to two $p$-adic digits of precision. We continue to lift
the whole cycle to two $p$-adic digits of precision, and update $j(E_1)$ according to
formula (1) to obtain $j(E_2)$. Starting from $j(E_2)$, we now lift the cycle to four
$p$-adic digits of precision, compute $j(E_3)$ from this, and so on. We obtain

$$
\begin{aligned}
j(\tilde{E}) &= 41556 + O(p) \\
&= 41556 - 17953p + O(p^2) \\
&= 41556 - 17953p - 51143p^2 - 17793p^3 + O(p^4) \\
&= 41556 - 17953p - 51143p^2 - 17793p^3 + 45123p^4 + 52596p^5 + 18237p^6 \\
&\quad + 42211p^7 + O(p^8) \\
&= 41556 - 17953p - 51143p^2 - 17793p^3 + 45123p^4 + 52596p^5 + 18237p^6 \\
&\quad + 42211p^7 + 45716p^8 + 58788p^9 + 18836p^{10} - 4101p^{11} - 60004p^{12} \\
&\quad - 24668p^{13} + 27527p^{14} - 58942p^{15} + O(p^{16}).
\end{aligned}
$$

From the estimate in [2, Chapter 7, Section 6], we know that we need approximately
3000 decimals of accuracy. As we have $10^{3000} \approx p^{585}$ we compute $j(\tilde{E})$
up to 585 $p$-adic digits. The class group $Cl(D) \cong \mathbf{Z}/2\mathbf{Z} \times \mathbf{Z}/66\mathbf{Z}$ is generated
by a prime of norm 2 and a prime of norm 29, so we find all 132 conjugates of
$j(\tilde{E})$ under $\operatorname{Gal}(H/K)$ up to 585 $p$-adic digits using the modular polynomials $\Phi_2$
and $\Phi_{29}$. In the end, we expand the polynomial of degree 132 to find the class
polynomial

$$ F_D = \prod_{[I] \in Cl(D)} (X - j(\tilde{E}_I)) \in \mathbf{Z}[X]. $$
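
The conversion from decimal accuracy to $p$-adic digits is a one-line computation. The snippet below takes the figure of 3000 decimals from the text as given and only performs the change of base.

```python
from math import ceil, log

p = 135403
decimals = 3000                                   # accuracy estimate quoted in the text
digits = ceil(decimals * log(10) / log(p))        # smallest k with p**k >= 10**decimals
print(digits)
```

Starting from one digit and doubling at each Newton step, ten steps (up to $2^{10} = 1024$ digits) are enough to pass this bound.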

We now compute a root of $F_D$ in $\mathbf{F}_q$, and note that we can check our computations
so far by testing whether $F_D$ splits completely in $\mathbf{F}_q[X]$. One of the 132
roots is $j = 5215470850369 \in \mathbf{F}_q$, and an elliptic curve with this $j$-invariant is

$$ E_a : y^2 = x^3 + ax - a \quad\text{with}\quad a = 27j/(4(1728 - j)) = 1460967812073632 \in \mathbf{F}_q. $$

We know that $E_a$ has CM with $\mathcal{O}_D$ and that its trace of Frobenius equals $\pm t$.
To test whether it actually equals $t$, we look at the order of $P = (1, 1) \in E_a(\mathbf{F}_q)$.
We see that $(q + 1 + t)P \neq O = (q + 1 - t)P$, so $E_a$ must have trace equal to $t$.
We conclude that the curve defined by

$$ y^2 = x^3 + 1460967812073632x + 543093418434659 $$

over $\mathbf{F}_{2004061230508291}$ has exactly 2004061320040618 rational points.
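
The final test on the point $P = (1,1)$ can be repeated with elementary affine arithmetic. The sketch below uses the values of $q$, $N$ and $a$ from the text; the functions `ec_add` and `ec_mul` are a plain textbook implementation written for this illustration and assume, as the text states, that $q$ is prime. The point at infinity is represented by `None`.

```python
q = 2004061230508291
N = 2004061320040618
a = 1460967812073632          # curve y^2 = x^3 + a*x - a over F_q
t = q + 1 - N

def ec_add(P, Q):
    """Add two points on y^2 = x^3 + a*x + b over F_q (b is not needed by the formulas)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % q == 0:
        return None                                   # P + (-P) = O
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1 % q, -1, q) % q
    else:
        lam = (y2 - y1) * pow((x2 - x1) % q, -1, q) % q
    x3 = (lam * lam - x1 - x2) % q
    y3 = (lam * (x1 - x3) - y1) % q
    return x3, y3

def ec_mul(k, P):
    """Double-and-add scalar multiplication for k >= 0."""
    R = None
    while k > 0:
        if k & 1:
            R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

P = (1, 1)
print(ec_mul(N, P) is None, ec_mul(q + 1 + t, P) is None)
```

According to the text, the first value should be `True` and the second `False`.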

5 Using Class Invariants 

A serious drawback of the complex multiplication method is that it requires the
computation of the class polynomial $F_D$, which grows rapidly in size with $D$,
and is already sizable for moderately small discriminants. Around 1900, Weber
[12] computed generating polynomials for Hilbert class fields using the values at
CM-points of modular functions of higher level instead of the $j$-function. These
techniques have become important again in an algorithmic context, and many
complications from Weber’s days are now well understood [4,10]. The classical
theory of class invariants is firmly rooted in complex analytic arguments, but
much of it can be made to work in our non-archimedean setting. This section
gives an indication of what can be done, leaving a fuller treatment to [1].

The complex multiplication method to compute the class polynomial $F_D$
is based on the fact that its zeroes are the $j$-invariants of the elliptic curves
in characteristic zero having complex multiplication with $\mathcal{O}_D$. In the complex
analytic setting, we simply list the complex lattices (up to scaling) giving rise
to such curves and compute their $j$-invariants to sufficient accuracy using the
$q$-expansion of $j$. In the $p$-adic setting, we first compute one such curve in the
finite set $\mathrm{Ell}_D(\mathbf{F}_p)$. Then we lift the action on $\mathrm{Ell}_D(\mathbf{F}_p)$ of the group $\mathcal{I}(p)$ of
$\mathcal{O}_D$-ideals prime to $p$ to an action on the set $X_D(\mathbf{C}_p)$ of their $\mathbf{C}_p$-lifts in such
a way that the induced action on the subset $\mathrm{Ell}_D(\mathbf{Q}_p)$ of canonical lifts is the
standard Galois action coming from $Cl(D)$. This enables us to compute the finite
subset $\mathrm{Ell}_D(\mathbf{Q}_p) \subset X_D(\mathbf{C}_p) \subset \mathbf{C}_p$ in $\mathbf{C}_p$ consisting of the zeroes of $F_D$. We use a
Newton type lift of our curve in $\mathrm{Ell}_D(\mathbf{F}_p)$ to $\mathrm{Ell}_D(\mathbf{Q}_p)$ together with the explicit
Galois action of $Cl(D)$ on $\mathrm{Ell}_D(\mathbf{Q}_p)$, which we handle by exploiting modular
polynomials.

We now want to replace $j$ by modular functions of higher level. These are
elements of the modular function field $\mathcal{F} = \bigcup_{n \geq 1} \mathcal{F}_n$ as defined in [6, Chapter 6,
Section 3]. Here $\mathcal{F}_n$ denotes the field of modular functions of level $n$ over $\mathbf{Q}$. It can
be viewed as the function field of the modular curve $X(n)$ over the cyclotomic
field $\mathbf{Q}(\zeta_n)$. Over $\mathcal{F}_1 = \mathbf{Q}(j)$, it is generated by the Fricke functions of level $n$,
which are normalized $x$-coordinates of $n$-torsion points on an elliptic curve with
$j$-invariant $j$.

The modular functions $f$ we use are integral over $\mathbf{Z}[j]$, so they are given as
the zero of some irreducible polynomial $\Psi_f(X, j) \in \mathbf{Z}[j, X]$. If we specialize $j$ to
be the $j$-invariant of a curve $E \in \mathrm{Ell}_D(\mathbf{Q}_p)$, then the roots of the polynomial
$\Psi_f(X, j(E)) \in H[X]$, which has integral coefficients, lie in the ray class field
$H_n$ of conductor $n$ of $K = \mathbf{Q}(\sqrt{D})$, with $n$ the level of $f$. It is known that for
many choices of ‘small’ $f$, one or more of these roots are class invariants that
actually lie in the Hilbert class field $H = H_1 \subset H_n$ of $K$. If we can determine
which roots end up in $H$, and compute the explicit Galois action of $Cl(D)$ on
such a root, we can compute its irreducible polynomial over $K$ or $\mathbf{Q}$ just like we
did this for $j$. In the complex analytic setting the tool to perform these tasks is
Shimura’s reciprocity law [4,10]. It tells us in which points $\tau \in K \cap \mathbf{H}$ a function
$f$ should be evaluated to obtain a class invariant, and describes the conjugates
of $f(\tau)$ over $K$ as the values of conjugates of $f$ over $\mathbf{Q}(j)$ in certain other points
$\tau' \in K$. These values can be approximated in $\mathbf{C}$ using the $q$-expansions of $f$
and its conjugates. Once we have computed the irreducible polynomial of $f(\tau)$
over $K$ or $\mathbf{Q}$, one can use the relation $\Psi_f(f(\tau), j(\tau)) = 0$ to obtain information
on $j(\tau)$ itself.

In a $p$-adic setting, we cannot deal with $j$ and $f$ as functions on the complex
upper half plane, and the expansion of modular functions as Fourier series in
$q = e^{2\pi i \tau}$ has no non-archimedean analogue when dealing as we do with CM-curves,
which have integral $j$-invariants. What we do have is an action from class
field theory of the $\mathcal{O}_D$-ideals coprime to $n$ on the roots of $\Psi_f(X, j(E)) \in H[X]$
in $H_n$ associated to $j(E)$. This action factors via the class group $Cl(n^2 D)$ of the
order of discriminant $n^2 D$, which is a non-maximal order for $n > 1$.

Example 1. The modular function $j : \mathbf{H} \to \mathbf{C}$ has a holomorphic cube root
$\gamma_2 : \mathbf{H} \to \mathbf{C}$ that is modular of level 3 and has $\Psi_{\gamma_2}(X, j) = X^3 - j$. It is the
unique root of $X^3 - j$ having a rational $q$-expansion. If $D$ is not divisible by 3
and we write $\mathcal{O}_D = \mathbf{Z}[\tau]$ with $\tau + \bar{\tau} \equiv 0 \bmod 3$, then $\gamma_2(\tau)$ is a class invariant,
and its ‘size’ is only one third of that of $j(\tau)$.

If $E : y^2 = x^3 + ax + b$ is in $\mathrm{Ell}_D(\mathbf{Q}_p)$ and $c_1, \ldots, c_4$ are the 4 roots of its
3-division polynomial, then

$$ \frac{-48a}{2a - 3(c_1 c_2 + c_3 c_4)} \tag{3} $$

is a cube root of $j(E)$. As there are 3 ways to divide the roots $c_1, \ldots, c_4$ in two
sets of two roots each, formula (3) yields 3 distinct cube roots of $j$. Note that
there is no obvious way to single out an expression in (3) as being ‘the function’
corresponding to $\gamma_2$.
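
Formula (3) can be tested numerically over the complex numbers. The sketch below is an illustration under assumptions of its own: it works over $\mathbf{C}$ instead of $\mathbf{Q}_p$, uses the sample curve $y^2 = x^3 + x + 1$, takes the 3-division polynomial in the standard form $3x^4 + 6ax^2 + 12bx - a^2$, and finds its roots with a small Durand-Kerner iteration written for this example.

```python
def durand_kerner(coeffs, iterations=500):
    """Approximate all complex roots of a polynomial (coefficients from highest degree down)."""
    lead = coeffs[0]
    c = [x / lead for x in coeffs]
    n = len(c) - 1
    roots = [(0.4 + 0.9j) ** k for k in range(n)]    # customary distinct starting values
    for _ in range(iterations):
        for i in range(n):
            r = roots[i]
            value = 0
            for ck in c:                             # Horner evaluation at r
                value = value * r + ck
            denom = 1
            for k in range(n):
                if k != i:
                    denom *= r - roots[k]
            roots[i] = r - value / denom
    return roots

a, b = 1.0, 1.0                                      # sample curve y^2 = x^3 + x + 1
j = 1728 * 4 * a ** 3 / (4 * a ** 3 + 27 * b ** 2)
c1, c2, c3, c4 = durand_kerner([3, 0, 6 * a, 12 * b, -a * a])

# the three ways of splitting the four roots into two pairs
pairings = [(c1, c2, c3, c4), (c1, c3, c2, c4), (c1, c4, c2, c3)]
gammas = [-48 * a / (2 * a - 3 * (u * v + w * z)) for u, v, w, z in pairings]
for g in gammas:
    print(g, g ** 3, j)
```

Each of the three values should cube to $j$ up to rounding, and the three values should differ by cube roots of unity.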

If $I$ is an $\mathcal{O}_D$-ideal prime to $3p$, the isogeny $E \to E_I$ induces an isomorphism
$E[3] \to E_I[3]$ on the 3-torsion subgroups, and maps each possible cube root of
$j(E)$ in (3) to some well-defined cube root of $j(E_I)$. This is the Galois action
of the Artin symbol of $I$, which maps $j(E)$ to $j(E_I)$, on these cube roots. It
provides an extension of the map $\rho_I : \mathrm{Ell}_D(\mathbf{Q}_p) \to \mathrm{Ell}_D(\mathbf{Q}_p)$ to the set of
the cube roots of (the $j$-invariants of) the elements in $\mathrm{Ell}_D(\mathbf{Q}_p)$. As all these
cube roots are $p$-adically integral, reduction of this map provides an extension of
$\rho_I : \mathrm{Ell}_D(\mathbf{F}_p) \to \mathrm{Ell}_D(\mathbf{F}_p)$ to the set of cube roots in $\mathbf{F}_p$, provided we have $p \neq 3$.
In a similar way, we have an extension of the map $\rho_I : X_D(\mathbf{C}_p) \to X_D(\mathbf{C}_p)$ to
the set of cube roots.

Example 1 nicely illustrates that the cube roots of $j$ are functions on the modular
curve $X(3)$, the points of which can be viewed as isomorphism classes of elliptic
curves with complete 3-level structure: in order to have a well defined value not
only the $j$-invariant of the curve but also some ‘ordering’ of 3-torsion points is
required.

As the field $\mathcal{F}_n$ is generated over $\mathbf{Q}(j)$ by Fricke functions, a modular function
$f$ of level $n$ is always a rational expression in $j$ and the roots of the $n$-th division
polynomial of a curve with $j$-invariant $j$. As in Example 1, the action $\rho_I$ of
an $\mathcal{O}_D$-ideal $I$ coprime to $pn$ on $X_D(\mathbf{C}_p)$ naturally maps the roots of $\Psi_f(X, j)$
for $j \in X_D(\mathbf{C}_p)$ to the roots of $\Psi_f(X, \rho_I(j))$. This observation suffices to treat
modular functions providing class invariants by $p$-adic methods similar to that
in section 3.

Suppose $f$ is an integral modular function of level $n$ that is known to provide
class invariants for $\mathcal{O}_D$ at certain CM-points for $\mathcal{O}_D$ in $\mathbf{H}$. Then we know
that for every curve $E \in \mathrm{Ell}_D(\mathbf{Q}_p)$, certain roots of $\Psi_f(X, j(E)) \in \mathbf{Q}_p[X]$ are
class invariants, so they lie in $\mathbf{Q}_p$. If $\bar{E} \in \mathrm{Ell}_D(\mathbf{F}_p)$ is the reduction of $E$, then
$\Psi_f(X, j(\bar{E})) \in \mathbf{F}_p[X]$ will have roots in $\mathbf{F}_p$, and we want to know which roots
in $\mathbf{F}_p$ arise as the reduction of class invariants. Let $\bar{\beta}$ be a root, and assume for
simplicity that $\Psi_f(X, j(\bar{E}))$ is separable. (This is usually the case in practice as
$p$ is not too small, of size $O(|D|^{1+\varepsilon})$.) The roots of $\Psi_f(X, j(E)) \in \mathbf{Q}_p[X]$ all lie
in $H_n$, and they are class invariants exactly when they are fixed by the Galois
group

$$ \operatorname{Gal}(H_n/H) \cong (\mathcal{O}_D/n\mathcal{O}_D)^*/\mathcal{O}_D^* = \ker[Cl(n^2 D) \to Cl(D)]. $$

It follows that $\beta$ arises as the reduction of a class invariant if it is fixed by the
maps $\rho_x : \mathrm{Ell}_D(\mathbf{F}_p) \to \mathrm{Ell}_D(\mathbf{F}_p)$ for a set of generators $x$ of $(\mathcal{O}_D/n\mathcal{O}_D)^*/\mathcal{O}_D^*$.
We can compute $\rho_x(j(E))$ as before when $x$ is sufficiently smooth, and we need
the extension of the action to the roots of $\Psi_f(X, j(E))$. In theory this can be
done by working with explicit Weierstrass models and an explicit description of
$f$ in terms of $n$-torsion points as in Example 1. In practice we work with modular
polynomials $\Phi_\ell^f(X, Y)$ for prime degree $\ell$ isogenies relating the roots of $\Psi_f(X, j_1)$ to
those of $\Psi_f(X, j_2)$ when $j_1$ and $j_2$ are $j$-invariants of $\ell$-isogenous curves. Such modular
polynomials are in many ways similar to the modular polynomials $\Phi_\ell(X, Y)$ arising
for the $j$-function, and they may be found using complex analytic methods
involving $q$-expansions.

Suppose we find that a certain root $\beta \in \mathbf{F}_p$ of $\Psi_f(X, j(E))$ is the reduction
of a class invariant $\tilde\beta$. Then we lift the pair $(E, \beta)$ to the pair $(\tilde E, \tilde\beta)$ consisting
of the canonical lift $\tilde E/\mathbf{Q}_p$ and the class invariant $\tilde\beta$. In theory this can be
done by lifting $j(E)$ as in section 3, using cycles of smooth isogenies, and then
computing the Hensel lift of $\beta$ to the root $\tilde\beta$. This method makes use of the modular
polynomials for $j$, which are big. In many cases the corresponding polynomials
$\Phi_\ell^f$ for $f$ are much more pleasant to work with, so it is better to lift isogeny
cycles not in terms of $j$-invariants but in terms of a root of $\Psi_f(X, j)$, and find
the resulting $j$-invariant from its corresponding root.

All that remains is computing the conjugates of the pairs $(\tilde E, \tilde\beta)$ under $\mathrm{Cl}(D)$.
This is done in exactly the same manner as before, using small primes that
generate $\mathrm{Cl}(D)$. We are only interested in the conjugates of $\tilde\beta$, so we use the
modular polynomials $\Phi_\ell^f$ to compute these conjugates. In the end we expand the
polynomial

$$P_D^f = \prod_{\mathfrak{m} \in \mathrm{Cl}(D)} (X - \tilde\beta^{\mathfrak{m}}) \in \mathbf{Z}[X] \quad (\text{or } \mathcal{O}_D[X]).$$

This polynomial splits again completely in $\mathbf{F}_q[X]$, and from a root in $\mathbf{F}_q$ we
compute the corresponding $j$-value in $\mathbf{F}_q$.

Example 2. We take for $f$ the Weber function $\mathfrak{f}$, which is classically defined on $\mathbf{H}$
in terms of the Dedekind $\eta$-function as $\mathfrak{f}(\tau) = \zeta_{48}^{-1}\,\eta((\tau+1)/2)/\eta(\tau)$. It is a modular
function of level 48 of degree 72 over $\mathbf{Q}(j)$ with

$$\Psi_{\mathfrak f}(X, j) = (X^{24} - 16)^3 - jX^{24} \in \mathbf{Z}[j, X].$$

For discriminants $D \equiv 1 \bmod 8$ with $3 \nmid D$, it yields class invariants when
evaluated at appropriate points $\tau \in \mathbf{H}$. In other cases small powers of $\mathfrak{f}$ often have
the same property [4].

In principle we can express $\mathfrak{f}$ in terms of $x$-coordinates of 48-torsion points,
but there is no need to do this. It suffices to find, for $E/\mathbf{F}_p$ a curve having CM
with $\mathcal{O}_D$, first a root $\beta \in \mathbf{F}_p$ of $\Psi_{\mathfrak f}(X, j(E))$ that is the reduction of a class
invariant $\tilde\beta$, then a good approximation of the root $\tilde\beta$ of $\Psi_{\mathfrak f}(X, j(\tilde E))$ in $\mathbf{Q}_p$,
and finally good approximations in $\mathbf{Q}_p$ of the conjugates of $\tilde\beta$ under $\mathrm{Cl}(D)$.
These questions ultimately reduce to computing the action of an ideal $L$ of
prime norm $\ell \nmid pn$ on pairs $(E', \beta')$, with $E'$ in $X_D(\mathbf{C}_p)$ or $\mathrm{Ell}_D(\mathbf{F}_p)$ and $\beta'$
a root of $\Psi_{\mathfrak f}(X, j(E'))$ in $\mathbf{Q}_p$ or $\mathbf{F}_p$. For $E'$ we know how to compute $j(E'^L)$;
for $\beta'^L$ we use the fact that it is a zero of the modular polynomial $\Phi_\ell^{\mathfrak f}(\beta', X)$
and of $\Psi_{\mathfrak f}(X, j(E'^L))$. Usually there are only two roots of $\Phi_\ell^{\mathfrak f}(\beta', X)$ that we need
to consider, the correct one and the image under the action of the conjugate ideal $\bar L$. It is of
great help that the modular polynomials $\Phi_\ell^{\mathfrak f}$ are quite a bit smaller than the classical
modular polynomials for $j$. For small $\ell$ they are really small, like

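$$\Phi_5^{\mathfrak f}(X, Y) = (X^5 - Y)(X - Y^5) + 5XY,$$
$$\Phi_7^{\mathfrak f}(X, Y) = (X^7 - Y)(X - Y^7) + 7(XY - X^4Y^4).$$

For $\ell = 13$ it takes at least two of these pages to write down the classical modular
polynomial $\Phi_{13}(X, Y)$, but we have

$$\Phi_{13}^{\mathfrak f}(X, Y) = (X^{13} - Y)(X - Y^{13}) + 5 \cdot 13\,XY
 + 13(X^2Y^{12} + X^{12}Y^2 + 4X^{10}Y^4 + 4X^4Y^{10} + 6X^6Y^8 + 6X^8Y^6).$$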
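
As a quick sanity check on the shape of these polynomials, the following sketch (in Python, which is our choice and not part of the original computation) evaluates the two smallest ones, read here as $(X^5-Y)(X-Y^5)+5XY$ and $(X^7-Y)(X-Y^7)+7(XY-X^4Y^4)$, and confirms on a small grid of integer points that they are symmetric in $X$ and $Y$. The function names are hypothetical.

```python
def phi5(x, y):
    """Level-5 modular polynomial for the Weber function, as read from the text."""
    return (x**5 - y) * (x - y**5) + 5 * x * y


def phi7(x, y):
    """Level-7 modular polynomial for the Weber function, as read from the text."""
    return (x**7 - y) * (x - y**7) + 7 * (x * y - x**4 * y**4)


def phi5_expanded(x, y):
    """The same level-5 polynomial multiplied out by hand."""
    return x**6 + y**6 - x**5 * y**5 + 4 * x * y


# A small grid of integer sample points.
samples = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]

# Both polynomials should be unchanged when X and Y are swapped.
symmetric = all(
    phi5(a, b) == phi5(b, a) and phi7(a, b) == phi7(b, a) for a, b in samples
)

# The factored and expanded forms of the level-5 polynomial should agree.
expansion_ok = all(phi5(a, b) == phi5_expanded(a, b) for a, b in samples)

print(symmetric, expansion_ok)
```

Each polynomial has only a handful of monomials with single-digit coefficients, which is the point of the comparison with the classical modular polynomials for $j$.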

6 An Elliptic Curve Having $10^{30}$ Points

We illustrate the power of the method presented in the previous section by
constructing an elliptic curve having exactly $N = 10^{30}$ rational points.

Just as in section 4, we first look for a suitable discriminant. We write
$N = q + 1 - t$, and by looking at $|t|$ slightly less than $2\sqrt{N} = 2 \cdot 10^{15}$, find for trace
$t = 1999999999167682$ that the number $q = 10^{30} + t - 1$ is prime and that

$$\Delta = t^2 - 4q = -2^{12} \cdot 3^2 \cdot 5^2 \cdot 17^2 \cdot 367^2 \cdot 3943 \cdot 23537$$

has a large square factor leading to $D = -92806391$. The corresponding class
group $\mathrm{Cl}(D)$ has order 15610. Computing the Hilbert class polynomial for our $D$
would require an accuracy of 313618 decimals, which is clearly not practical.
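
The arithmetic in this paragraph is easy to re-check by machine. The following Python sketch is an illustration only (the variable names are ours, and Python is not the tool used in the original computation); it recomputes $q$, $\Delta$ and the square part of $\Delta$ from $N$ and $t$, together with the auxiliary prime $p$ and the precision estimate that appear later in this section. Primality of $q$ and $p$ is taken from the text and not tested here.

```python
from math import isqrt

N = 10**30
t = 1999999999167682
q = N + t - 1                 # chosen so that N = q + 1 - t
delta = t * t - 4 * q         # discriminant of a Frobenius of trace t over F_q
D = -92806391                 # discriminant stated in the text

# Write delta = f^2 * D and recover f.
f_squared, remainder = divmod(delta, D)
f = isqrt(f_squared)

# Auxiliary 'small' prime used for the p-adic lifting, from s = 132.
s = 132
p = (s * s - 4 * D) // 4

# Precision estimate: the Weber function saves a factor 72.
decimals_hilbert = 313618
decimals_weber = -(-decimals_hilbert // 72)   # ceiling division

print(q, delta, f, p, decimals_weber)
```

With these values, $\Delta = f^2 D$ with $f = 2^6 \cdot 3 \cdot 5 \cdot 17 \cdot 367$, and $D = -3943 \cdot 23537$.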

Instead, we notice that we have $D \equiv 1 \bmod 8$ and $D \not\equiv 0 \bmod 3$, so we can
use the classical Weber function $\mathfrak{f}$ to compute a class polynomial for $H(K)$.
We first compute an elliptic curve in $\mathrm{Ell}_D(\mathbf{F}_p)$ for a 'small' prime $p$. We have
$D \equiv 1 \bmod 8$, and the smallest $s \in \mathbf{Z}_{>0}$ with $p = (s^2 - 4D)/4$ prime is $s = 132$,
leading to $p = 92810747$. The first curve of trace 132 we encounter is
$E_{1086} : Y^2 = X^3 + 1086X - 1086$ of $j$-invariant 37202456. As $E$ has all three of its
two-torsion points defined over $\mathbf{F}_p$, its endomorphism ring is $\mathcal{O}_D$, not $\mathcal{O}_{4D}$.
We now have to determine which root $\beta$ of the polynomial
$\Psi_{\mathfrak f}(X, j(E)) = (X^{24} - 16)^3 - j(E)X^{24} \in \mathbf{F}_p[X]$ is the reduction of a class
invariant $\tilde\beta \in \mathbf{Z}_p$. We are lucky since $\pm 21677132$ are the only two roots in $\mathbf{F}_p$.
Since $-\tilde\beta$ is also a class invariant, it does not matter which root we pick.
We take $\beta = 21677132$.

For the smooth ideal inducing pa we pick (a) = (— 420±7Tp), which factors as 

(11, 8 ± 27Tp) • (17, 4 ± 27Tp)2 • (23, 16 ± 2npf ■ (31, 13 ± 27Tp) • (41, 30 ± 27Tp). 

Just as in section 4, we compute the cycle in $\mathbf{F}_p$ for the $j$-invariants:

$$37202456 \to 4967239 \to \cdots \to 21402782 \to 37202456.$$

Using this cycle, we can also compute the cycle for $\beta$. For instance, the modular
polynomial $\Phi_{11}^{\mathfrak f}(\beta, Y) \in \mathbf{F}_p[Y]$ has two roots: 32604444 and 60476019. In order
to determine which root to take, we note that $\beta^{L_{11}}$ is a root of
$\Psi_{\mathfrak f}(X, j(E^{L_{11}})) = (X^{24} - 16)^3 - j(E^{L_{11}})X^{24} \in \mathbf{F}_p[X]$. We find that 60476019 is the
root we need. Continuing like this, we get the following cycle for $\beta$ in $\mathbf{F}_p$:

$$21677132 \to 60476019 \to \cdots \to 53004472 \to 21677132.$$

Just as in section 4, we lift $E/\mathbf{F}_p$ to $E_1/\mathbf{Q}_p$ by lifting the coefficients of its
Weierstrass equation. We could now compute the canonical lift $\tilde E$ in two $p$-adic
digits just like we did in section 4, and use that information to compute $\tilde\beta \in \mathbf{Z}_p$ in
two $p$-adic digits accuracy. Indeed, $\tilde\beta$ is a root of $\Psi_{\mathfrak f}(X, j(\tilde E)) \in \mathbf{Z}_p[X]$, and since
we have $(\tilde\beta \bmod p) = \beta$, this root is just a Hensel lift of $\beta$. This approach has
the disadvantage that we have to use modular polynomials for the $j$-function.

Instead, we lift $\beta$ to a root $\beta_1$ of $\Psi_{\mathfrak f}(X, j(E_1)) \in \mathbf{Z}_p[X]$. Now we lift the cycle
that we had for $\beta \in \mathbf{F}_p$ to a cycle for $\beta_1 \in \mathbf{Z}_p$ by applying the small modular
polynomials for $\mathfrak{f}$ once more. Since we know that $\beta_1^{(\alpha)}$ is a root of
$\Psi_{\mathfrak f}(X, j(E_1^{(\alpha)}))$, we can compute

$$j(E_1^{(\alpha)}) = \frac{\bigl((\beta_1^{(\alpha)})^{24} - 16\bigr)^3}{(\beta_1^{(\alpha)})^{24}}$$

and use this value to update $j(E_1)$ as in formula (1) to a value of the $j$-invariant of
the canonical lift $\tilde E$ that is accurate to two $p$-adic digits. Knowing $j(\tilde E) \bmod p^2$,
we can lift $\beta \in \mathbf{F}_p$ to a root of $\Psi_{\mathfrak f}(X, j(\tilde E))$ in $\mathbf{Z}_p$ that is accurate to two $p$-adic
digits. We continue this process of doubling the precision until we have $\tilde\beta$ with
sufficient accuracy. The first four cycles yield:



$$\begin{aligned}
\tilde\beta &= 21677132 + O(p) \\
 &= 21677132 + 28966941p + O(p^2) \\
 &= 21677132 + 28966941p + 7010373p^2 + 31182954p^3 + O(p^4) \\
 &= 21677132 + 28966941p + 7010373p^2 + 31182954p^3 - 33808617p^4 \\
 &\quad + 27519307p^5 - 31601027p^6 - 36195013p^7 + O(p^8) \\
 &= 21677132 + 28966941p + 7010373p^2 + 31182954p^3 - 33808617p^4 \\
 &\quad + 27519307p^5 - 31601027p^6 - 36195013p^7 - 8331811p^8 \\
 &\quad - 33957007p^9 - 18191700p^{10} + 5895954p^{11} - 42670221p^{12} \\
 &\quad + 23637278p^{13} - 40784695p^{14} + 7754196p^{15} + O(p^{16}).
\end{aligned}$$
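
The precision-doubling step can be illustrated in isolation. The sketch below is not the lifting of isogeny cycles described above; it only shows the Newton (Hensel) iteration that turns a simple root modulo $p$ into a root modulo $p^2, p^4, \ldots$, on the toy polynomial $x^2 - 2$ with $p = 7$. Both the polynomial and the prime are assumptions chosen for illustration, and the function name `hensel_lift` is hypothetical.

```python
def hensel_lift(f, df, root, p, target_digits):
    """Lift a simple root of f modulo p to a root modulo p**target_digits.

    Each Newton step doubles the number of correct p-adic digits, which is
    the same doubling pattern O(p), O(p^2), O(p^4), ... seen in the text.
    """
    precision = 1
    x = root % p
    while precision < target_digits:
        precision = min(2 * precision, target_digits)
        modulus = p ** precision
        # Newton step x <- x - f(x) / f'(x), carried out modulo p**precision.
        x = (x - f(x) * pow(df(x), -1, modulus)) % modulus
    return x


p = 7
f = lambda x: x * x - 2        # toy polynomial; 3 is a simple root modulo 7
df = lambda x: 2 * x
lifted = hensel_lift(f, df, 3, p, 16)

print(lifted)
```

The modular inverse `pow(a, -1, m)` requires Python 3.8 or later. In the computation described in the text the role of $f$ is played by $\Psi_{\mathfrak f}(X, j(\tilde E))$, whose coefficients are themselves only known to the current precision.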



We expect to need $\lceil 313618/72 \rceil = 4356$ decimals of accuracy, so we compute $\tilde\beta$
up to 550 $p$-adic digits. The class group $\mathrm{Cl}(D)$, which is cyclic of order 15610, is
generated by a prime of norm 11. We can thus compute all the conjugates of $\tilde\beta$
under $\mathrm{Gal}(H/K)$ to 550 $p$-adic digits using the modular polynomial $\Phi_{11}^{\mathfrak f}$. In the
end, we expand the polynomial of degree 15610 to find the class polynomial

$$P_D^{\mathfrak f} = \prod_{\mathfrak{m} \in \mathrm{Cl}(D)} (X - \tilde\beta^{\mathfrak{m}}) \in \mathbf{Z}[X],$$

which we reduce modulo $q$ to compute a root $\gamma \in \mathbf{F}_q$. From $\gamma$, we compute the
corresponding $j$-value and write down a curve $E$ with that $j$-invariant. We then
know that $E$ has CM with $\mathcal{O}_D$, but we still need to check whether the trace really
equals $t$. This turns out not to be the case, so we conclude that the quadratic
twist of $E$, given by

$$Y^2 = X^3 + 669397215131271955483581235905X + 363369366443977510319399421188$$

over $\mathbf{F}_q$, with $q = 10^{30} + 1999999999167681$, has exactly $10^{30}$ rational points.
Checking that this curve indeed has the required number of points is an easy
matter for the current point counting algorithms.
Abstract. We discuss the situation where a curve $C$, defined over a
number field $K$, has a known $K$-rational divisor class of degree 1, and
consider whether this class contains an actual $K$-rational divisor. When $C$
has points everywhere locally, the local to global principle of the Brauer
group gives the existence of such a divisor. In this situation, we give an
alternative, more down to earth, approach, which indicates how to compute
this divisor in certain situations. We also discuss examples where $C$
does not have points everywhere locally, and where no such $K$-rational
divisor is contained in the $K$-rational divisor class.



1 Introduction 

The following result is typically proved as a direct consequence of the local to 
global principle of the Brauer group (see, for example, [2] or p.30 of [3]). 

Lemma 1. Let $C$ be a curve defined over a number field $K$ with points everywhere
locally. Then any $K$-rational degree 1 divisor class $\mathcal{D}$ contains a $K$-rational
divisor $D$.

Such divisor classes have relevance to the application of second descents [4], as
well as the application of the Brauer-Manin obstruction to higher genus curves,
where such a class $\mathcal{D}$ is used to obtain an embedding $P \mapsto [P] - \mathcal{D}$ from $C(K)$
to $J(K)$, the Mordell-Weil group of the Jacobian. This embedding can sometimes
be used to find information about $C(K)$.

Our intention here is to describe, in a concise and explicit manner, how the 
problem of finding a rational divisor in a rational divisor class corresponds to 
finding a rational point on a certain algebraic variety. We give an example of 
how this description can be used in practice to find a rational divisor explicitly, 
given a rational divisor class. 
2 The Brauer-Severi Variety of a Divisor Class

Let $C$ be a curve defined over a field $K$ and let $\bar K$ be a separable closure of $K$.
We say a divisor class $\mathcal{D} \in \mathrm{Pic}_C(\bar K)$ is rational if it is fixed under the action of
$\mathrm{Gal}(\bar K/K)$. This means that for any divisor $D \in \mathcal{D}$ and $\sigma \in \mathrm{Gal}(\bar K/K)$, we have
that ${}^\sigma D$ and $D$ are linearly equivalent. In the language of Galois modules and
Galois cohomology, we have

$$\mathrm{Pic}_C(\bar K)^{\mathrm{Gal}(\bar K/K)} = H^0(K, \mathrm{Pic}_C(\bar K)) = \{\mathcal{D} \in \mathrm{Pic}_C(\bar K) : \mathcal{D} \text{ is a rational divisor class}\}.$$

This group is different from $\mathrm{Pic}_C(K)$, which simply consists of the linear
equivalence classes in $\mathrm{Div}_C(K)$. There is an obvious embedding
$\mathrm{Pic}_C(K) \subset \mathrm{Pic}_C(\bar K)^{\mathrm{Gal}(\bar K/K)}$ and we identify $\mathrm{Pic}_C(K)$ with its image.

For $D \in \mathrm{Div}_C(\bar K)$, we adopt the standard notation

$$\mathcal{L}(D) = \{f \in \bar K(C) : (f) \ge -D\}. \qquad (1)$$

This is a finite dimensional vector space over $\bar K$. We write $l(D)$ for its dimension.
The Riemann-Roch theorem asserts that, for any divisor $D \in \mathrm{Div}_C(\bar K)$ and any
canonical divisor $\kappa$ of $C$, we have $l(D) - l(\kappa - D) = \deg D - g + 1$. Furthermore
$l(D)$ only depends on the equivalence class of $D$ and we write $l([D]) = l(D)$.

We write

$$V_D(\bar K) = \mathbf{P}\mathcal{L}(D) \qquad (2)$$

for the complete linear system of $D$. Via the map $f \mapsto (f) + D$, we see that this
set is in bijection with the set of effective divisors linearly equivalent to $D$:

$$V_D(\bar K) \simeq V_{[D]}(\bar K) := \{D' \in \mathrm{Div}_C(\bar K) : D' \ge 0 \text{ and } D' \sim D\}. \qquad (3)$$

Let $\mathcal{D}$ be a rational divisor class with $l(\mathcal{D}) > 0$. Then $V_{\mathcal{D}}(\bar K)$ is fixed under
$\mathrm{Gal}(\bar K/K)$ and has the structure of a Galois set. Generalizing the above notation,
for an extension $L$ of $K$, we write $V_{\mathcal{D}}(L)$ for the effective divisors of $C$ defined
over $L$ and in $\mathcal{D}$.

The functor $V_{\mathcal{D}}$ is represented by a scheme over $K$, which is called the
Brauer-Severi variety associated to $\mathcal{D}$.

It follows that the following are equivalent. 

- $V_{\mathcal{D}}(K) \neq \emptyset$

- $V_{\mathcal{D}} \simeq \mathbf{P}^{l(\mathcal{D})-1}$ over $K$

- $\mathcal{D}$ contains a rational divisor.

In some cases it is easy to see that the conditions above hold. For instance,
if $\mathcal{D}$ is a rational divisor class with $l(\mathcal{D}) = 1$, then over $\bar K$, there is a unique
effective divisor $D$ representing $\mathcal{D}$. Consequently, $D$ is fixed under $\mathrm{Gal}(\bar K/K)$
and therefore is itself rational.

If a curve has a rational point $P_0$ then the property above is sufficient to
deduce that any rational divisor class contains rational divisors. We use that for
any divisor $D$ and point $P$ the following inequalities hold:

$$l(D) \le l(D + P) \le l(D) + 1. \qquad (4)$$
It follows from (4) that for any divisor class $\mathcal{D}$ there exists an integer $n$ such
that $l(\mathcal{D} + [nP_0]) = 1$. The argument above shows that the rational divisor class
$\mathcal{D} + [nP_0]$ contains a rational divisor $D$ and therefore $D - nP_0 \in \mathcal{D}$.
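
The existence of such an $n$ uses slightly more than (4) alone. A brief sketch of the missing step, written in LaTeX, with $g$ denoting the genus of $C$ and using only the Riemann-Roch theorem as quoted above:

```latex
% Why some n with l(D + [n P_0]) = 1 exists (g is the genus of C).
\begin{align*}
\deg(\mathcal{D} + [nP_0]) < 0
  &\implies l(\mathcal{D} + [nP_0]) = 0, \\
\deg(\mathcal{D} + [nP_0]) > 2g - 2
  &\implies l(\mathcal{D} + [nP_0]) = \deg \mathcal{D} + n - g + 1 .
\end{align*}
% By (4), the function n -> l(D + [n P_0]) is non-decreasing and increases by
% at most 1 at each step. It vanishes for n << 0 and is unbounded for n >> 0,
% so it takes the value 1 for some integer n.
```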

In particular, if a curve $C$ over a number field $K$ has points everywhere
locally and $l(\mathcal{D}) > 0$ then $V_{\mathcal{D}}$ has rational points everywhere locally. Hence,
Lemma 1 is equivalent to the assertion that the local-to-global principle applies
to the Brauer-Severi varieties $V_{\mathcal{D}}$. For $l(\mathcal{D}) = 2$, we have that $V_{\mathcal{D}}$ is a curve of
genus 0, and the Hasse-Minkowski theorem confirms that such varieties obey a
local-to-global principle.

In fact, without assuming local solvability of $C$, the geometry of $V_{\mathcal{D}}$ still
allows deductions to be made about representability of divisor classes by rational
divisors. For instance, any $\mathcal{D}$ with $l(\mathcal{D}) = 2$ is representable by a divisor over a
quadratic extension of $K$, since any curve of genus 0 has quadratic points.

We now sketch how one can proceed, given a rational divisor class $\mathcal{D}$, to
derive an explicit model of $V_{\mathcal{D}}$ in such a way that finding a rational point on
it allows the construction of a representing rational divisor. Suppose that $\mathcal{D}$ is
represented by a divisor $D$ over some finite extension $L = K(\alpha)$ of $K$ of degree,
say, $d$.

1. Determine a basis $f_1, \ldots, f_{l(D)} \in L(C)$ of $\mathcal{L}(D)$.

2. Over $L$, we have $V_{\mathcal{D}} \simeq \mathbf{P}^{l(D)-1}$ via the inverse of the map

$$(t_1 : \cdots : t_{l(D)}) \mapsto D + (t_1 f_1 + \cdots + t_{l(D)} f_{l(D)}).$$

This establishes a model of $V_{\mathcal{D}}$ over $L$, with $(t_1 : \cdots : t_{l(D)})$ as projective
coordinates. Putting $t_1 = 1$ yields an affine chart $(t_2, \ldots, t_{l(D)})$ of $V_{\mathcal{D}}$ over $L$.

3. In order to descend our model of $V_{\mathcal{D}}$ over $L$ to a model over $K$, we compute a
representation of $D_t = D + (f_1 + t_2 f_2 + \cdots + t_{l(D)} f_{l(D)}) \in \mathrm{Div}_C(L(t_2, \ldots, t_{l(D)}))$
corresponding to the generic point $(1 : t_2 : \cdots : t_{l(D)})$ on $V_{\mathcal{D}}$ as a scheme
over $L$.

Let $C$ be given as a plane curve with coordinates $X$ and $Y$ in general
position over $K$. The effective divisor $D_t$ can be described by the equations
$g(X) = 0$, $Y = h(X)$, with $g, h \in L(t_2, \ldots, t_{l(D)})[X]$, with $g$ monic,
$\deg(g) = \deg(D)$ and $\deg(h) = \deg(D) - 1$, since we have taken $X$ and $Y$
such that degeneracies do not occur.

4. We substitute $t_i = \sum_{j=0}^{d-1} u_{i,j}\alpha^j$ and write

$$g(X) = \sum_{k=0}^{d-1} g_k(X)\alpha^k, \quad \text{where } g_k(X) \in K(\{u_{i,j}\}_{i,j})[X],$$

and similarly for $h(X)$. A point $(1 : t_2 : \cdots : t_{l(D)})$ corresponds to a divisor
over $K$ precisely when that divisor can be described by equations not involving
$\alpha$. Thus, we are led to consider the equations obtained by insisting
that $g_k(X) = h_k(X) = 0$ for $k = 1, \ldots, d - 1$ as polynomials in $X$. Those
equations define an affine chart of $V_{\mathcal{D}}$ over $K$, with coordinates $u_{i,j}$ over $K$.
To get a model of $V_{\mathcal{D}}$, one can take the projective closure.
5. If one finds a point $(u_{i,j})$, one can reconstruct $g, h$ from these values and
obtain a description of a $K$-rational divisor in $\mathcal{D}$. Equivalently, one can
reconstruct $(1 : t_2 : \cdots : t_{l(D)})$ and thus obtain a $K$-rational specialization
of $D_t$.



Of course, the procedure described above only applies to divisor classes
satisfying $l(\mathcal{D}) > 0$. In general, one should select some divisor $D_0$ over $K$, preferably
of minimal positive degree, and an integer $n$ such that $l(\mathcal{D} + [nD_0])$ is minimally
positive. One can then apply the procedure to that divisor class and derive a
suitable representative of $\mathcal{D}$ from the result, or conclude that none exists.



3 Finding a Rational Divisor on a Curve of Genus 2

We will give an example of this for a genus 2 curve

$$C : Y^2 = f_6X^6 + f_5X^5 + \cdots + f_0 = F(X), \quad \text{with } F(X) \in K[X] \qquad (5)$$

and a rational divisor class $\mathcal{B}$ of degree 3, represented by an effective divisor
defined over a quadratic extension of $K$.

We assume that $\mathcal{B}$ is a rational divisor class with $\mathcal{B} = [P_1 + P_2 + P_3]$ and
$P_1, P_2, P_3 \in C(K(\sqrt{d}))$, where the $P_i$ are not all Weierstrass points. It follows
that $l(\mathcal{B}) = 2$. We can arrive at an equation for $V_{\mathcal{B}}$ by following the outline
in Section 2. We will give an account that can be read independently, but point
out the correspondences with the general algorithm.

Let $G_t(x) \in K(\sqrt{d})[t][x]$ be the cubic in $x$ such that $y = G_t(x)$ passes through
the points $P_1, P_2, P_3$ for all values of $t$. For any value $t \in K(\sqrt{d})$ we have
{x - xi){x - X2){X - X3) ^ 
y-Gt{x) 

This allows us to write down a basis of B{B) for Step 1 in Section 2. We find 
fi = 1, /2 = coordinates (1 : t) on C{B). 

For any value of $t$, we have the identity

$$\{y = G_t(x)\} \cap C = P_1 + P_2 + P_3 + (x_1, y_1) + (x_2, y_2) + (x_3, y_3) - 3O, \qquad (6)$$

where $O$ is any canonical divisor¹ of $C$. The divisor $(x_1, y_1) + (x_2, y_2) + (x_3, y_3)$ can
be described by $\{C_t(x) = 0,\ y = H_t(x)\}$, where $C_t(x), H_t(x) \in K(\sqrt{d})[t][x]$ are
a monic cubic and a quadratic in $x$ respectively. This is the required description
for Step 3 of Section 2.

Substituting $t = t_1 + \sqrt{d}\,t_2$, we get that the coefficients $c_i$ of $C_t$ are of
the form $c_i = c_{i,0}(t_1, t_2) + \sqrt{d}\,c_{i,1}(t_1, t_2)$, with $c_{i,j} \in K[t_1, t_2]$, and similarly for
$H_t$. Setting the $\sqrt{d}$-components to 0 yields 6 equations in $t_1, t_2$ over $K$, which
describe the genus 0 curve $V_{\mathcal{B}}$. This corresponds to Step 4 of Section 2, where
we put $t_2 = t_{2,1} + \sqrt{d}\,t_{2,2}$.

¹ The divisor $O = \infty^+ + \infty^-$, consisting of the intersection of $C$ with $X = \infty$, is a
popular choice among people computing with curves of genus 2.

If $C$ has points everywhere locally, then Lemma 1 asserts that $V_{\mathcal{B}}$ has a
rational point. Let $C_{t_0}(x), H_{t_0}(x)$ correspond to such a point. Then the divisor
class $\mathcal{B}$ is represented by the rational divisor $B = 3O - \{C_{t_0}(x) = 0,\ y = H_{t_0}(x)\}$.
This corresponds to Step 5 of Section 2.

Of course, since there exists $O$, which is of degree 2 and defined over $K$, the
above can be applied to the rational divisor class $\mathcal{D} = [P_1 + P_2 + P_3 - O]$ of
degree 1, which is represented by the rational divisor $D = B - O$.

We illustrate the above ideas with a detailed worked example. We first
observe how a $K$-rational divisor class can arise naturally, in such a way that the
contained $K$-rational divisor is not immediately apparent.

Let $C$ be the genus 2 curve, defined over $\mathbf{Q}$,

$$C : Y^2 = F(X) = -X^6 - X^5 - 2X^4 - 2X^3 + X^2 - 2X + 2, \qquad (7)$$

which is easily checked to have points in $\mathbf{R}$ and every $\mathbf{Q}_p$. One can perform a
2-descent on the Mordell-Weil group $J(\mathbf{Q})$ of the Jacobian, as described in [6],
using the map

y : J7(Q) ^ Q(0)7Q*(Q(0)*)2 : [^(x„ y,)] 

where 0 is a root of F{X). One of the steps of the 2-descent on J7(Q) is the 
computation of the kernel of /r, generated by 2J7(Q) and [Pi + P{ — O], where 
Fi = (^ + ^ + ^a) and a = 7—55, with P{ denoting the conjugate of Pi 

with respect to Q(a)/Q. Let Pf be the hyperelliptic involute of Pi. Then general 
properties of y (see [6] or p.55 of [3]) guarantee that [P^ +P[ — 0] G 2j7(Q(o!)), 
and computing preimages under the multiplication by 2 map on J7 (Q(q;)), one 
finds Pi = [(U-\/^, 2q;)-|-(1— 7^, 2a)— O] which satisfies [Pj“-|-P{— O] = 2Pi. 
Clearly = —Pi, since conjugation merely negates the y-coordinates. Let 

Pi = {\ + \cc,l^ + i|a), P2 = (1 + 7=5, 2a), P3 = (1 - 7=5, 2a), 

P = [Pi] -f Pi = [Pi + P 2 + P 3 - O], where a = 7=^. '' '' 



Then 

V' = [P[] + P'l = [P[] - Pi = [Pi'j - 2Pi -k Pi = [Pi'j - [Pf -k P[ -0]+Vi=V, 

so that $\mathcal{D}$ is defined over $\mathbf{Q}$. We now have a naturally occurring divisor class $\mathcal{D}$ of
degree 1, which is defined over $\mathbf{Q}$, but whose naturally occurring representative
$P_1 + P_2 + P_3 - O$ is not itself defined over $\mathbf{Q}$. This is a common outcome of an
application of 2-descent on $J(\mathbf{Q})$ for genus 2 curves. Note that such $\mathcal{D}$ are of some
interest, as they allow an embedding of $C(\mathbf{Q})$ into $J(\mathbf{Q})$ via the map $P \mapsto [P] - \mathcal{D}$,
even when no obvious member of $C(\mathbf{Q})$ is available.
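
As a consistency check on the curve (7) and on the points $P_2, P_3$, the following Python sketch evaluates $F$ at $x = 1 \pm \sqrt{-5}$ with exact arithmetic in $\mathbf{Q}(\sqrt{-5})$. It assumes that the coordinates are to be read as $P_2 = (1 + \sqrt{-5}, 2\alpha)$ and $P_3 = (1 - \sqrt{-5}, 2\alpha)$ with $\alpha = \sqrt{-55}$; the helper names are ours.

```python
def mul(u, v, d=-5):
    """Multiply a + b*sqrt(d) and c + e*sqrt(d), stored as integer pairs."""
    a, b = u
    c, e = v
    return (a * c + d * b * e, a * e + b * c)


def F_at(x):
    """Evaluate F(X) = -X^6 - X^5 - 2X^4 - 2X^3 + X^2 - 2X + 2 by Horner's rule."""
    coefficients = [-1, -1, -2, -2, 1, -2, 2]   # from X^6 down to X^0
    acc = (0, 0)
    for c in coefficients:
        acc = mul(acc, x)
        acc = (acc[0] + c, acc[1])
    return acc


x_P2 = (1, 1)     # 1 + sqrt(-5)
x_P3 = (1, -1)    # 1 - sqrt(-5)
y_squared = (2 * 2) * (-55)   # (2*alpha)^2 with alpha^2 = -55

print(F_at(x_P2), F_at(x_P3), y_squared)
```

Both evaluations give the rational number $-220 = (2\alpha)^2$, so under this reading the two points do lie on $C$.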

Since our curve has points everywhere locally, we know that $\mathcal{D}$ does contain
an actual $\mathbf{Q}$-rational divisor. We now illustrate how this can be found in practice.
One first finds the general $Y = G(X)$, through $P_1, P_2, P_3$, where $G(X)$ is cubic
in $X$, and where there is a free parameter $t$, since we have one less than the
number of points required to define the cubic uniquely. This parametrized family
of cubics is



G{X) = + (§ - f ^ - f)X^ 

+(-3 + 7t+^ + ^)X + 9-3t-^-^. 



(10) 



Computing $G(X)^2 - F(X)$, where $F(X)$ is as in (7), and removing the cubic
factor $(X - X(P_1))(X - X(P_2))(X - X(P_3))$ leaves the residual cubic

$$\begin{aligned}
C(X) = {} & 10(1 + t^2)X^3 + (35 + 2\alpha + 30t - 8t\alpha - 25t^2 - 2t^2\alpha)X^2 \\
 & + (-50 - 4\alpha - 60t + 16t\alpha + 70t^2 + 4t^2\alpha)X \\
 & + 30 + 12\alpha + 180t - 8t\alpha - 30t^2 - 12t^2\alpha.
\end{aligned} \qquad (11)$$



Let $(x_1, y_1) + (x_2, y_2) + (x_3, y_3) - O \in \mathcal{D}$. Then $x_1, x_2, x_3$ are the roots of $C(X)$,
for some choice of $t \in \mathbf{Q}(\alpha)$. Furthermore, the $y_i = -G(x_i)$, where $G(X)$ is
as in (10). We now compute the quadratic $Y = H(X)$ which passes through
$(x_1, -G(x_1)), (x_2, -G(x_2)), (x_3, -G(x_3))$, namely

$$\begin{aligned}
H(X) = \bigl( & (15 - 60t - 15t^2 - 4\alpha - 4t\alpha + 4t^2\alpha)X^2 \\
 & + (-30 + 120t + 30t^2 + 8\alpha + 8t\alpha - 8t^2\alpha)X \\
 & + 90 - 60t - 90t^2 - 4\alpha - 24t\alpha + 4t^2\alpha \bigr)/(10 + 10t^2).
\end{aligned} \qquad (12)$$

We have now parametrized divisors of the form $(x_1, y_1) + (x_2, y_2) + (x_3, y_3) - O$
which are in $\mathcal{D}$. Our requirement for $(x_1, y_1) + (x_2, y_2) + (x_3, y_3) - O$ to be
$\mathbf{Q}$-rational is the same as requiring that the ratios of the coefficients of $C(x)$
are $\mathbf{Q}$-rational (giving six polynomials in $t$, which can be reduced to three
polynomials on the assumption that a given coefficient of $C(x)$ is nonzero; however,
we shall prefer to write out all six polynomials in full), and that the actual
coefficients of $H(x)$ are also $\mathbf{Q}$-rational (giving three polynomials in $t$). This gives
in total nine polynomials in $t$ that must be $\mathbf{Q}$-rational. Let $t = t_1 + t_2\alpha$.
Then on computing the coefficients on $\alpha$ in our nine expressions, we find nine
quartics in $t_1, t_2$, all with a common factor of $t_1^2 + 55t_2^2 + 15t_2 + 1$. Then

$$t_1^2 + 55t_2^2 + 15t_2 + 1 = 0 \qquad (13)$$

is our desired curve of genus 0, which has points everywhere locally, and hence
globally. The solution of smallest height is $t_1 = 1/7$, $t_2 = -1/7$, corresponding to
$t = 1/7 - \alpha/7$. Substituting this into $C(X)$ and $H(X)$, and removing from $C(X)$
a factor of $-5(5 + 2\alpha)/49$ (permissible, since the roots of $C(X)$ are unaffected),
gives

$$C(X) = 2X^3 + X^2 + 2X + 2, \qquad H(X) = -\tfrac{1}{2}X^2 + X - 1. \qquad (14)$$

These $C(X), H(X)$ define $(x_1, y_1) + (x_2, y_2) + (x_3, y_3) - O$, which is our desired
$\mathbf{Q}$-rational divisor in our given $\mathbf{Q}$-rational divisor class $\mathcal{D}$. We summarize the
above as follows.
Example 1. Let $C$ be the curve (7) and $\mathcal{D}$ be the $\mathbf{Q}$-rational divisor class in (9).
Then $\mathcal{D}$ contains the $\mathbf{Q}$-rational divisor $(x_1, y_1) + (x_2, y_2) + (x_3, y_3) - O$, where
$x_1, x_2, x_3$ are the roots of $C(X)$ and $y_i = H(x_i)$ for $i = 1, 2, 3$, with $C(X), H(X)$
as in (14).

We have written a Maple file at [1] which performs the above computation for 
any curve of genus 2. 
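
Independently of that Maple file, the specialization step of the worked example can be re-checked in a few lines. The Python sketch below is our own illustration, not the authors' code. It reads the residual cubic (11) as $10(1+t^2)X^3 + (35 + 2\alpha + 30t - 8t\alpha - 25t^2 - 2t^2\alpha)X^2 + (-50 - 4\alpha - 60t + 16t\alpha + 70t^2 + 4t^2\alpha)X + 30 + 12\alpha + 180t - 8t\alpha - 30t^2 - 12t^2\alpha$, takes the quadratic (12) in the analogous reading, and substitutes $t = 1/7 - \alpha/7$ using exact arithmetic in $\mathbf{Q}(\alpha)$, $\alpha^2 = -55$. All helper names are hypothetical.

```python
from fractions import Fraction as Fr

# Elements of Q(alpha), alpha^2 = -55, stored as pairs (a, b) meaning a + b*alpha.
def add(*xs):
    return (sum(x[0] for x in xs), sum(x[1] for x in xs))

def mul(x, y):
    return (x[0] * y[0] - 55 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def scal(c, x):
    return (c * x[0], c * x[1])

def inv(x):
    n = x[0] ** 2 + 55 * x[1] ** 2      # norm of x
    return (x[0] / n, -x[1] / n)

one, alpha = (Fr(1), Fr(0)), (Fr(0), Fr(1))
t1, t2 = Fr(1, 7), Fr(-1, 7)            # the point of smallest height on (13)
t = (t1, t2)
tt = mul(t, t)
monomials = [one, alpha, t, mul(t, alpha), tt, mul(tt, alpha)]

def combo(*cs):
    """Return c0 + c1*alpha + c2*t + c3*t*alpha + c4*t^2 + c5*t^2*alpha."""
    return add(*(scal(c, m) for c, m in zip(cs, monomials)))

# Coefficients of C(X) as in (11), highest degree first.
C = [combo(10, 0, 0, 0, 10, 0), combo(35, 2, 30, -8, -25, -2),
     combo(-50, -4, -60, 16, 70, 4), combo(30, 12, 180, -8, -30, -12)]
# Numerator coefficients of H(X) as in (12), and the denominator 10 + 10 t^2.
H_num = [combo(15, -4, -60, -4, -15, 4), combo(-30, 8, 120, 8, 30, -8),
         combo(90, -4, -60, -24, -90, 4)]
den = combo(10, 0, 0, 0, 10, 0)

conic_value = t1**2 + 55 * t2**2 + 15 * t2 + 1       # left side of (13)
C_norm = [mul(c, scal(2, inv(C[0]))) for c in C]     # rescale: leading coefficient 2
H_coeffs = [mul(h, inv(den)) for h in H_num]

print(conic_value, C_norm, H_coeffs)
```

The rescaling of $C(X)$ divides by half its leading coefficient, which equals the factor $-5(5 + 2\alpha)/49$ removed in the text; all $\alpha$-components of the results vanish, and the output matches (14).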



4 Examples Where the Rational Divisor Class Contains 
No Rational Divisor 

Suppose now that our genus 2 curve is defined over a number field $K$ and is of
the form

$$\mathcal{H} : Y^2 = kF_1(X)F_2(X), \quad \text{with } F_1(X) = a_3X^3 + a_2X^2 + a_1X + a_0, \qquad (15)$$

$k \in K^*$, each $a_i = g_i + h_i\sqrt{d} \in K(\sqrt{d})$, and $F_1(X), F_2(X)$ conjugate.

We shall also assume that $F_1(X)$ is not defined over $K$. Let $e_1, e_2, e_3$ denote the
roots of $F_1(X)$. This is another situation where we have a naturally occurring
degree 1 divisor class $\mathcal{D} = [(e_1, 0) + (e_2, 0) + (e_3, 0) - O]$ which is defined over $K$, even
though the given representative is defined over $K(\sqrt{d})$ and not generally over $K$.
If $\mathcal{H}$ has points everywhere locally, we know that $\mathcal{D}$ must contain a divisor defined
over $K$, but we do not make that assumption in this section. The parametrized
family of cubics through $(e_1, 0), (e_2, 0), (e_3, 0)$ is simply $Y = G(X) = tF_1(X)$,
where $t = t_1 + t_2\sqrt{d}$ and $t_1, t_2$ are $K$-rational parameters. Replacing $Y$ by $tF_1(X)$
in $Y^2 - kF_1(X)F_2(X)$ and removing the known factor $F_1(X)$ gives the residual
cubic $C(X) = t^2F_1(X) - kF_2(X)$. Let $(x_1, y_1) + (x_2, y_2) + (x_3, y_3) - O \in \mathcal{D}$.
Then $x_1, x_2, x_3$ are the roots of $C(X)$, for some choice of $t \in K(\sqrt{d})$. Furthermore,
each $y_i = -G(x_i)$; we compute the quadratic $Y = H(X)$ passing through
the $(x_i, -G(x_i))$, namely

$$H(X) = 2k\ell\bigl((g_2h_3 - g_3h_2)X^2 + (g_1h_3 - g_3h_1)X + g_0h_3 - g_3h_0\bigr), \quad \text{where} \qquad (16)$$
$$\ell = (-k + t_2^2d - t_1^2)(g_3t_2 + h_3t_1)d - (k + t_2^2d - t_1^2)(g_3t_1 + h_3t_2d)\sqrt{d}.$$

Now, suppose that $(x_1, y_1) + (x_2, y_2) + (x_3, y_3) - O$ is defined over $K$. Then the
ratios of coefficients of $C(X)$ must be in $K$, giving the six equations

$$(k + t_2^2d - t_1^2)(-k + t_2^2d - t_1^2)(g_ih_j - g_jh_i) = 0, \quad \text{for } \{i, j\} \subset \{0, 1, 2, 3\},\ i < j. \qquad (17)$$

Furthermore, the coefficients of $H(X)$ must be in $K$, giving the three equations

$$(k + t_2^2d - t_1^2)(g_3t_1 + h_3t_2d)(g_ih_3 - g_3h_i) = 0, \quad \text{for } i = 0, 1, 2. \qquad (18)$$

Inspecting (17), we note that we cannot have all $g_ih_j - g_jh_i = 0$, since then our
original curve $\mathcal{H}$ would have zero discriminant. So, one of the first two factors
in (17) must be 0, giving that $\mathrm{Norm}(t) = \pm k$. If $\mathrm{Norm}(t) = -k$ then a similar
argument (we have placed the details in the file [1]) applied to the equations (18)
shows that either $\mathcal{H}$ has zero discriminant (which is not permitted) or that $F_1, F_2$
are each defined over $K$ (which is also not permitted). In summary, if $\mathcal{D}$ contains
a $K$-rational divisor then we must have $\mathrm{Norm}(t) = k$ for some $t$.

Conversely, if $\mathrm{Norm}(t) = k$ for some $t \in K(\sqrt{d})$, then the $K$-rational
divisor $(x_1, y_1) + (x_2, y_2) + (x_3, y_3) - O$ in $\mathcal{D}$ is defined by

$$C_0(X) = tF_1(X) - t'F_2(X), \qquad H_0(X) = -4k^2d(g_3t_2 + h_3t_1)\sum_{i=0}^{2}(g_ih_3 - g_3h_i)X^i. \qquad (19)$$



We summarize this as follows. 

Lemma 2. Let $\mathcal{H} : Y^2 = kF_1(X)F_2(X)$ be a curve of genus 2 of the type (15),
defined over a number field $K$, where $F_1(X)$ is defined over $K(\sqrt{d})$ and not
over $K$. Let $\mathcal{D}$ be the $K$-rational divisor class $[(e_1, 0) + (e_2, 0) + (e_3, 0) - O]$,
where $e_1, e_2, e_3$ are the roots of $F_1(X)$. Then $\mathcal{D}$ contains a $K$-rational divisor if
and only if $\mathrm{Norm}(t) = k$ for some $t \in K(\sqrt{d})$, in which case the required divisor
is $(x_1, y_1) + (x_2, y_2) + (x_3, y_3) - O$, where $x_1, x_2, x_3$ are the roots of $C_0(X)$ and
each $y_i = H_0(x_i)$, with $C_0, H_0$ as in (19).

In this situation, our genus 0 curve is simply the curve $t_1^2 - dt_2^2 = k$. Of course,
any choice of $K, d, k$ (such as $\mathbf{Q}, 2, 5$), where $k$ is not a norm in $K(\sqrt{d})$, will give
an example where $\mathcal{D}$ does not contain a $K$-rational divisor.
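
For the sample choice $(K, d, k) = (\mathbf{Q}, 2, 5)$ the failure is already visible 5-adically. If $5 = a^2 - 2b^2$ with $a, b \in \mathbf{Q}$, then clearing denominators gives integers $x, y, z$, not all divisible by 5, with $x^2 - 2y^2 = 5z^2$. The following brute-force Python sketch (an illustration with hypothetical names, not part of the original text) checks that no such triple exists even modulo 25.

```python
def local_solutions(d, k, modulus):
    """All residue triples (x, y, z) with x^2 - d*y^2 - k*z^2 = 0 mod modulus."""
    solutions = []
    for x in range(modulus):
        for y in range(modulus):
            for z in range(modulus):
                if (x * x - d * y * y - k * z * z) % modulus == 0:
                    solutions.append((x, y, z))
    return solutions


solutions = local_solutions(2, 5, 25)

# A primitive solution has at least one coordinate not divisible by 5.
primitive = [s for s in solutions if any(c % 5 for c in s)]

print(len(solutions), primitive)
```

The list of primitive solutions is empty, so the conic $t_1^2 - 2t_2^2 = 5$ has no point over $\mathbf{Q}_5$ and hence none over $\mathbf{Q}$.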
Abstract. In this paper, we study $p$-divisibility of discriminants of
Hecke algebras associated to spaces of cusp forms of prime level. By
considering cusp forms of weight bigger than 2, we are led to make a
precise conjecture about indexes of Hecke algebras in their normalisation
which implies (if true) the surprising conjecture that there are no mod $p$
congruences between non-conjugate newforms in $S_2(\Gamma_0(p))$, but there are
almost always many such congruences when the weight is bigger than 2.



1 Basic Definitions 

We first recall some commutative algebra related to discriminants, then introduce 
Hecke algebras of spaces of cusp forms. 



1.1 Commutative Algebra 

In this section we recall the definition of discriminant of a finite algebra and note 
that the discriminant is nonzero if and only if no base extension of the algebra 
contains nilpotents. 

Let $R$ be a ring and let $A$ be an $R$-algebra that is free of finite rank as an
$R$-module. The trace of $x \in A$ is the trace, in the sense of linear algebra, of left
multiplication by $x$.

Definition 1 (Discriminant). Let $w_1, \ldots, w_n$ be an $R$-basis for $A$. Then the
discriminant $\mathrm{disc}(A)$ of $A$ is the determinant of the $n \times n$ matrix $(\mathrm{Tr}(w_iw_j))$.

The discriminant is only well-defined modulo squares of units in $R$. When $R = \mathbf{Z}$
the discriminant is well defined, since the only units are $\pm 1$.

We say that $A$ is separable over $R$ if for every extension $R'$ of $R$, the ring
$A \otimes R'$ contains no nilpotents.
Proposition 1. Suppose $R$ is a field. Then $A$ has nonzero discriminant if and
only if $A$ is separable over $R$.

Proof. For the convenience of the reader, we summarize the proof in [Mat86, 
§26]. If A contains a nilpotent then that nilpotent is in the kernel of the trace 
pairing, so the discriminant is 0. Conversely, if A is separable then we may 
assume that R is algebraically closed. Then A is an Artinian reduced ring, hence 
isomorphic as a ring to a finite product of copies of R, since R is algebraically 
closed. Thus the trace form on A is nondegenerate. 
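
Definition 1 and Proposition 1 can be tried out on small examples. The sketch below is an illustration in Python and is not the algorithm used for the Hecke algebra computations in this paper. It computes the discriminant of $A = \mathbf{Q}[x]/(g)$ for a monic polynomial $g$, with respect to the basis $1, x, \ldots, x^{n-1}$, directly from the trace form; the function names are hypothetical.

```python
from fractions import Fraction


def companion(coeffs):
    """Matrix of multiplication by x on Q[x]/(g) in the basis 1, x, ..., x^(n-1),
    where g = x^n + c_(n-1) x^(n-1) + ... + c_0 and coeffs = [c_0, ..., c_(n-1)]."""
    n = len(coeffs)
    M = [[0] * n for _ in range(n)]
    for i in range(1, n):
        M[i][i - 1] = 1              # x * x^(i-1) = x^i
    for i in range(n):
        M[i][n - 1] = -coeffs[i]     # x * x^(n-1) reduced modulo g
    return M


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det(A):
    """Determinant by Gaussian elimination over the rationals."""
    A = [[Fraction(v) for v in row] for row in A]
    n, result = len(A), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            result = -result
        result *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= factor * A[c][k]
    return result


def disc(coeffs):
    """Determinant of the matrix (Tr(x^i * x^j)) for 0 <= i, j < n."""
    n = len(coeffs)
    M = companion(coeffs)
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    traces = []
    for _ in range(2 * n - 1):
        traces.append(sum(power[i][i] for i in range(n)))   # Tr(x^k)
        power = matmul(power, M)
    return det([[traces[i + j] for j in range(n)] for i in range(n)])


print(disc([-2, 0]), disc([-1, -1]), disc([0, 0]))
```

The three inputs are $x^2 - 2$, $x^2 - x - 1$ and $x^2$. The last algebra contains the nilpotent $x$, and its discriminant vanishes, in line with Proposition 1.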



1.2 The Discriminant Valuation 

We next introduce Hecke algebras attached to certain spaces of cusp forms of 
prime level p, define the discriminant valuation as the exponent of the largest 
power of p that divides the discriminant, and observe that there are eigenform 
congruences modulo p exactly when the discriminant valuation is positive. We 
then present an example to illustrate the definitions. 

Let $\Gamma$ be a congruence subgroup of $\mathrm{SL}_2(\mathbf{Z})$. In this paper, we will only
consider $\Gamma = \Gamma_0(p)$ for $p$ prime. For any positive integer $k$, let $S_k(\Gamma)$ denote the
space of holomorphic weight $k$ cusp forms for $\Gamma$. Let

$$\mathbf{T} = \mathbf{Z}[\ldots, T_n, \ldots] \subset \mathrm{End}(S_k(\Gamma))$$

be the associated Hecke algebra, which is generated by Hecke operators $T_n$ for
all integers $n$, including $n = p$ (we will sometimes write $U_p$ for $T_p$). Then $\mathbf{T}$ is
a commutative ring that is free as a module over $\mathbf{Z}$ of rank equal to $\dim S_k(\Gamma)$.
We will also sometimes consider the image of $\mathbf{T}$ in $\mathrm{End}(S_k(\Gamma)^{\mathrm{new}})$.

Definition 2 (Discriminant Valuation). Let $p$ be a prime, $k$ a positive
integer, and suppose that $\Gamma = \Gamma_0(p)$. Let $\mathbf{T}$ be the corresponding Hecke algebra.
Then the discriminant valuation of $\Gamma$ in weight $k$ is

$$d_k(\Gamma) = \mathrm{ord}_p(\mathrm{disc}(\mathbf{T})).$$

We expect that $d_k(\Gamma)$ is finite for the following reason. The Hecke operators
$T_n$, with $n$ not divisible by $p$, are diagonalizable since they are self adjoint with
respect to the Petersson inner product. When $k = 2$ one knows that $U_p$ is
diagonalizable since the level is square free, and when $k > 2$ one expects this (see
[CE98]). If $\mathbf{T}$ contains no nilpotents, Proposition 1 implies that the discriminant
of $\mathbf{T}$ is nonzero. Thus $d_k(\Gamma)$ is finite when $k = 2$ and conjectured to be finite
when $k > 2$.

Let $p$ be a prime and suppose that $\Gamma = \Gamma_0(p)$. A normalised eigenform is an
element $f = \sum a_nq^n \in S_k(\Gamma)$ that is an eigenvector for all Hecke operators $T_\ell$,
including those that divide $p$, normalised so that $a_1 = 1$. The quantity $d_k(\Gamma)$ is of
interest because it measures mod $p$ congruences between normalised eigenforms
in $S_k(\Gamma)$.
Proposition 2. Assume that $d_k(\Gamma)$ is finite. The discriminant valuation $d_k(\Gamma)$
is positive (i.e., the discriminant is divisible by $p$) if and only if there is a
congruence in characteristic $p$ between two normalized eigenforms in $S_k(\Gamma)$. (The
two congruent eigenforms might be Galois conjugate.)

Proof. It follows from Proposition 1 that $d_k(\Gamma) > 0$ if and only if $\mathbf{T} \otimes \mathbf{F}_p$ is not
separable. The Artinian ring $\mathbf{T} \otimes \mathbf{F}_p$ is not separable if and only if the number
of ring homomorphisms $\mathbf{T} \otimes \bar{\mathbf{F}}_p \to \bar{\mathbf{F}}_p$ is less than

$$\dim_{\mathbf{F}_p} \mathbf{T} \otimes \mathbf{F}_p = \dim_{\mathbf{C}} S_k(\Gamma).$$

Since $d_k(\Gamma)$ is finite, the number of ring homomorphisms $\mathbf{T} \otimes \bar{\mathbf{Q}}_p \to \bar{\mathbf{Q}}_p$ equals
$\dim_{\mathbf{C}} S_k(\Gamma)$. The proposition follows from the fact that for any ring $R$, there
is a bijection between ring homomorphisms $\mathbf{T} \to R$ and normalised eigenforms
with $q$-expansion in $R$.

The same proof also shows that a prime $\ell$ divides the discriminant of $\mathbf{T}$ if and
only if there is a congruence mod $\ell$ between two normalized eigenforms in $S_k(\Gamma)$.

Example 1. If $\Gamma = \Gamma_0(389)$ and $k = 2$, then $\dim_{\mathbf{C}} S_2(\Gamma) = 32$. Let $f$ be the
characteristic polynomial of $T_2$. One can check that $f$ is square free and 389
exactly divides the discriminant of $f$. This implies that $d_2(\Gamma) = 1$ and that $T_2$
generates $\mathbf{T} \otimes \mathbf{Z}_{389}$ as an algebra over $\mathbf{Z}_{389}$. (If $T_2$ only generated a subring of
$\mathbf{T} \otimes \mathbf{Z}_{389}$ of finite index $> 1$, then the discriminant of $f$ would be divisible by
$389^2$.)

Modulo 389 the characteristic polynomial $f$ is congruent to

$$(x + 2)(x + 56)(x + 135)(x + 158)(x + 175)^2(x + 315)(x + 342)(x^2 + 387)$$
$$(x^2 + 97x + 164)(x^2 + 231x + 64)(x^2 + 286x + 63)(x^5 + 88x^4 + 196x^3 + 113x^2 + 168x + 349)$$
$$(x^{11} + 276x^{10} + 182x^9 + 13x^8 + 298x^7 + 316x^6 + 213x^5 + 248x^4 + 108x^3 + 283x^2 + x + 101).$$

The factor $(x + 175)^2$ indicates that $\mathbf{T} \otimes \mathbf{F}_{389}$ is not separable over $\mathbf{F}_{389}$ since
the image of $(f/(x + 175))(T_2)$ in $\mathbf{T} \otimes \mathbf{F}_{389}$ is nilpotent (it is nonzero but its
square is 0). There are 32 eigenforms over $\overline{\mathbf{Q}}$ but only 31 mod 389 eigenforms,
so there must be a congruence. There is a newform $F$ in $S_2(\Gamma_0(389), \overline{\mathbf{Z}}_{389})$ whose
$a_2$ term is a root of

$$x^2 + (-39 + 190 \cdot 389 + 96 \cdot 389^2 + \cdots)x + (-106 + 43 \cdot 389 + 19 \cdot 389^2 + \cdots).$$

There is a congruence between $F$ and its $\mathrm{Gal}(\overline{\mathbf{Q}}_{389}/\mathbf{Q}_{389})$-conjugate.
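The 389-adic quadratic in Example 1 can be sanity-checked with integer arithmetic. The sketch below is only an illustration under stated assumptions: it truncates the two coefficients after the $389^2$ term (the digits printed in the example), and the helper name `valuation` is ours. It checks that the quadratic reduces to $(x + 175)^2$ modulo 389 and that its discriminant has 389-adic valuation 1, which is what one would expect for a pair of Galois-conjugate roots that are congruent to each other.

```python
P = 389

# Truncated 389-adic coefficients of x^2 + b*x + c, as printed in Example 1.
b = -39 + 190 * P + 96 * P**2
c = -106 + 43 * P + 19 * P**2


def valuation(n: int, p: int) -> int:
    """Return the exponent of p in the nonzero integer n."""
    n = abs(n)
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


# Modulo 389 the quadratic should be the repeated factor (x + 175)^2.
reduces_to_square = (b % P == (2 * 175) % P) and (c % P == (175 * 175) % P)

# The omitted digits are multiples of 389^3, so they change the discriminant
# only by a multiple of 389^3; any valuation below 3 is therefore reliable.
disc = b * b - 4 * c
disc_valuation = valuation(disc, P)
```

An odd valuation of the discriminant means the two roots generate a ramified quadratic extension of $\mathbf{Q}_{389}$, consistent with the self congruence described in the example.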

2 Computing Discriminants 

In this section we sketch the algorithm that we use for computing the
discriminants mentioned in this paper.

This algorithm was inspired by a discussion of the second author with
Hendrik Lenstra. We leave the details of converting the description below into
standard matrix operations to the reader. Also, the modular symbols algorithms
needed to compute Hecke operators are quite involved.

Let $\Gamma = \Gamma_0(p)$, and let $k \geq 2$ be an integer. The following sketches an
algorithm for computing the discriminant of the Hecke algebra $\mathbf{T}$ acting on
$S_k(\Gamma)$.

1. For any given $n$, we can explicitly compute a matrix that represents the
action of the Hecke operator $T_n$ on $S_k(\Gamma)$ using modular symbols. We use
the second author's MAGMA [BCP97] packages for computing with modular
symbols, which build on work of many people (including [Cre97] and
[Mer94]).

2. Using the Sturm bound, as described in the appendix to [LS02], find an
integer $b$ such that $T_1, \ldots, T_b$ generate $\mathbf{T}$ as a $\mathbf{Z}$-module. (The integer $b$ is
$\lceil (k/12) \cdot [\mathrm{SL}_2(\mathbf{Z}) : \Gamma] \rceil$.)

3. Find a subset $B$ of the $T_i$ that form a $\mathbf{Q}$-basis for $\mathbf{T} \otimes_{\mathbf{Z}} \mathbf{Q}$. (This uses Gauss
elimination.)

4. View $\mathbf{T}$ as a ring of matrices acting on $\mathbf{Q}^d$, where $d = \dim(S_k(\Gamma))$, and try
random sparse vectors $v \in \mathbf{Q}^d$ until we find one such that the vectors in the set
$C = \{T(v) : T \in B\}$ are linearly independent.

5. Write each of $T_1(v), \ldots, T_b(v)$ as a $\mathbf{Q}$-linear combination of the elements of $C$.

6. Find a $\mathbf{Z}$-basis $D$ for the $\mathbf{Z}$-span of these $\mathbf{Q}$-linear combinations of elements
of $C$. (This basis $D$ corresponds to a $\mathbf{Z}$-basis for $\mathbf{T}$, but is much easier to
find than directly looking for a $\mathbf{Z}$-basis in the space of $d \times d$ matrices that $\mathbf{T}$
is naturally computed in.)

7. Unwinding what we have done in the previous steps, find the trace pairing
on the elements of $D$, and deduce the discriminant of $\mathbf{T}$ by computing the
determinant of the trace pairing matrix.
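The final step can be illustrated on a toy example. The sketch below is not the MAGMA implementation; it assumes we are already given a $\mathbf{Z}$-basis of a commutative matrix algebra as explicit integer matrices (here the powers of a single $2 \times 2$ matrix, a hypothetical stand-in for Hecke operators) and computes the determinant of the trace pairing matrix. All function names are ours.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def trace_pairing_discriminant(basis):
    """Determinant of the matrix (Tr(x*y)) over a Z-basis of the algebra."""
    gram = [[trace(mat_mul(x, y)) for y in basis] for x in basis]
    return int(det(gram))


identity = [[1, 0], [0, 1]]
golden = [[0, 1], [1, 1]]   # characteristic polynomial x^2 - x - 1
sqrt2 = [[0, 2], [1, 0]]    # characteristic polynomial x^2 - 2

disc_golden = trace_pairing_discriminant([identity, golden])
disc_sqrt2 = trace_pairing_discriminant([identity, sqrt2])
```

In both toy cases the algebra is $\mathbf{Z}[A]$ acting on a space of dimension equal to its rank, so the result agrees with the discriminant of the characteristic polynomial of $A$.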

A very time-consuming step, at least in our implementation, is computing $D$
from $T_1(v), \ldots, T_b(v)$ expressed in terms of $C$, and this explains why we embed
$\mathbf{T}$ in $\mathbf{Q}^d$ instead of viewing the elements of $\mathbf{T}$ as vectors in $\mathbf{Q}^{d \times d}$.

An implementation by the second author of the above algorithm is included
with the MAGMA computer algebra system. The relevant source code is in the
file Geometry/ModSym/linalg.m in the package directory (or ask the second
author of the paper to send you a copy of linalg.m). We illustrate the use of MAGMA
to compute discriminants below; the commands were run under MAGMA V2.10-21 for
Linux on a computer with an Athlon 2800MP processor (2.1 GHz).

> M := ModularSymbols(389,2, +1);
> S := CuspidalSubspace(M);
> time D := DiscriminantOfHeckeAlgebra(S);
Time: 0.750
> D;
629670054720061882880174736321392595498204931550235108311\
04000000
> Factorisation(D);
[ <2, 53>, <3, 4>, <5, 6>, <31, 2>, <37, 1>, <389, 1>, ...]
> M := ModularSymbols(997,2, +1); S := CuspidalSubspace(M);
> time D := DiscriminantOfHeckeAlgebra(S);
Time: 55.600

The reason for the +1 in the construction of modular symbols is so that we
compute on a space that is isomorphic as a $\mathbf{T}$-module to one copy of $S_2(\Gamma_0(p))$,
instead of two copies.

3 Data about Discriminant Valuations 

In this section we report on our extensive computations of $d_k(\Gamma_0(p))$. We first
note that there is only one $p < 50000$ such that $d_2(\Gamma_0(p)) > 0$. Next we give a
table of values of $d_4(\Gamma_0(p))$, which seems to exhibit a nice pattern.

3.1 Weight Two

Theorem 1. The only prime $p < 60000$ such that $d_2(\Gamma_0(p)) > 0$ is $p = 389$,
with the possible exception of 50923 and 51437.

Computations in this direction by the second author have been cited in [Rib99],
[MS01], [OW02], and [MO02]. For example, Theorem 1 is used for $p < 1000$ in
[MS01] as a crucial step in proving that if $E$ is an elliptic curve over $\mathbf{Q}(\mu_p)$, with
$17 < p < 1000$, then not all elements of $E(\overline{\mathbf{Q}})[p]$ are rational over $\mathbf{Q}(\mu_p)$.

Proof. This is the result of a large computer computation. The rest of this proof 
describes how we did the computation, so the reader has some idea how to repli- 
cate or extend the computation. The computation described below took about 
one week using a cluster equipped with 10 Athlon 2000MP processors. The com- 
putations are nontrivial; we compute spaces of modular symbols, supersingular 
points, and Hecke operators on spaces of dimensions up to 5000. 

The aim is to determine whether or not p divides the discriminant of the 
Hecke algebra of level p for each p < 60000. If T is an operator with integral 
characteristic polynomial, we write disc(T) for disc(charpoly(T)), which also 
equals disc(Z[T]). We will often use that 

disc(T) mod p = disc(charpoly(T) mod p). 

We ruled out the possibility that $d_k(\Gamma_0(p)) > 0$ for most levels $p < 60000$
by computing characteristic polynomials of Hecke operators using an algorithm
that the second author and D. Kohel implemented in MAGMA ([BCP97]), which
is based on the Mestre-Oesterlé method of graphs [Mes86] (or contact the
second author for an English translation). Our implementation is available as the
"Module of Supersingular Points" package that comes with MAGMA. We
computed $\mathrm{disc}(T_q)$ modulo $p$ for several small primes $q$, and in most cases found a
prime $q$ such that this discriminant is nonzero. The following table summarises
how often we used each prime $q$ (note that there are 6057 primes up to 60000):

 q    number of p < 60000 where q is the smallest prime such that disc(T_q) ≠ 0 mod p
 2    5809 times
 3    161 (largest: 59471)
 5    43 (largest: 57793)
 7    15 (largest: 58699)
 11   15 (the smallest is 307; the largest 50971)
 13   2 (they are 577 and 5417)
 17   3 (they are 17209, 24533, and 47387)
 19   1 (it is 15661)
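The bookkeeping behind this table is easy to reproduce. The following sketch (our own illustration, with a plain sieve of Eratosthenes) counts the primes up to 60000 and compares that count with the sum of the right-hand column, which gives the number of levels not settled by a single prime $q \leq 19$.

```python
def primes_up_to(n):
    """Return all primes <= n using a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # Clear every multiple of i starting from i*i.
            sieve[i * i::i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


# Right-hand column of the table, keyed by the prime q.
counts = {2: 5809, 3: 161, 5: 43, 7: 15, 11: 15, 13: 2, 17: 3, 19: 1}

number_of_levels = len(primes_up_to(60000))
handled = sum(counts.values())
unresolved = number_of_levels - handled
```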



The numbers in the right column sum to 6049, so 8 levels are missing. These
are

389, 487, 2341, 7057, 15641, 28279, 50923, and 51437.

(The last two are still being processed. 51437 has the property that $\mathrm{disc}(T_q) = 0$
for $q = 2, 3, \ldots, 17$.) We determined the situation with the remaining 6 levels
using Hecke operators $T_n$ with $n$ composite.

 p       How we rule level p out, if possible
 389     p does divide discriminant
 487     using charpoly(T_12)
 2341    using charpoly(T_6)
 7057    using charpoly(T_18)
 15641   using charpoly(T_8)
 28279   using charpoly(T_34)



Computing $T_n$ with $n$ composite is very time consuming when $p$ is large,
so it is important to choose the right $T_n$ quickly. For $p = 28279$, here is a
trick we used to quickly find an $n$ such that $\mathrm{disc}(T_n)$ is not divisible by $p$. This
trick might be used to speed up the computation for some other levels. The
key idea is to efficiently discover which $T_n$ to compute. Computing $T_n$ on the
full space of modular symbols is difficult, but using projections we can compute
$T_n$ on subspaces of modular symbols with small dimension more quickly (see,
e.g., [Ste00, §3.5.2]). Let $M$ be the space of mod $p$ modular symbols of level
$p = 28279$, and let $f = \gcd(\mathrm{charpoly}(T_2), \mathrm{deriv}(\mathrm{charpoly}(T_2)))$. Let $V$ be the
kernel of $f(T_2)$ (this takes 7 minutes to compute). If $V = 0$, we would be done,
since then $\mathrm{disc}(T_2) \neq 0 \in \mathbf{F}_p$. In fact, $V$ has dimension 7. We find the first
few integers $n$ so that the charpoly of $T_n$ on $V$ has distinct roots, and they are
$n = 34$, 47, 53, and 89. We then computed $\mathrm{charpoly}(T_{34})$ directly on the whole
space and found that it has distinct roots modulo $p$.
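The gcd of a polynomial with its derivative, which isolates the repeated part of a characteristic polynomial modulo $p$, can be sketched in a few lines. The code below is an illustration only, not the MAGMA computation: it works modulo 389 rather than 28279, represents polynomials as coefficient lists from the constant term upward, and uses as input a product of some of the linear factors from Example 1 (a choice we make purely for demonstration). It assumes Python 3.8 or later for `pow(a, -1, p)`.

```python
P = 389  # modulus used for the demonstration


def trim(f):
    """Drop trailing zero coefficients in place and return the list."""
    while f and f[-1] == 0:
        f.pop()
    return f


def poly_mul(f, g, p=P):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return trim(out)


def poly_mod(f, g, p=P):
    """Remainder of f divided by the nonzero polynomial g, modulo p."""
    f = f[:]
    inv_lead = pow(g[-1], -1, p)
    while len(f) >= len(g):
        factor = f[-1] * inv_lead % p
        shift = len(f) - len(g)
        for i, b in enumerate(g):
            f[shift + i] = (f[shift + i] - factor * b) % p
        trim(f)
    return f


def derivative(f, p=P):
    return trim([(i * a) % p for i, a in enumerate(f)][1:])


def poly_gcd(f, g, p=P):
    """Monic gcd of f and g modulo p (f must be nonzero)."""
    f, g = trim(f[:]), trim(g[:])
    while g:
        f, g = g, poly_mod(f, g, p)
    inv_lead = pow(f[-1], -1, p)
    return [(a * inv_lead) % p for a in f]


def product_of_linear_factors(shifts):
    """Return prod (x + s) over the given shifts, modulo P."""
    result = [1]
    for s in shifts:
        result = poly_mul(result, [s % P, 1])
    return result


with_repeat = product_of_linear_factors([2, 56, 135, 158, 175, 175, 315])
without_repeat = product_of_linear_factors([2, 56, 135, 158, 175, 315])

repeated_part = poly_gcd(with_repeat, derivative(with_repeat))
trivial_part = poly_gcd(without_repeat, derivative(without_repeat))
```

A gcd equal to 1 corresponds to the case $V = 0$ in the text, where the discriminant is already nonzero modulo $p$; a nontrivial gcd singles out the part of the space on which further operators $T_n$ need to be examined.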



3.2 Some Data about Weight 4 

The following are the valuations $d = d_4(\Gamma_0(p))$ at $p$ of the discriminant of the
Hecke algebras associated to $S_4(\Gamma_0(p))$ for $p < 500$. This data suggests a pattern,
which motivates Conjecture 1 below.

 p    2    3    5    7   11   13   17   19   23   29   31   37   41   43   47   53   59
 d    0    0    0    0    0    2    2    2    2    4    4    6    6    6    6    8    8

 p   61   67   71   73   79   83   89   97  101  103  107  109  113  127  131  137  139
 d   10   10   10   12   12   12   14   16   16   16   16   18   18   20   20   22   24

 p  149  151  157  163  167  173  179  181  191  193  197  199  211  223  227  229  233
 d   24   24   26   26   26   28   28   30   30   32   32   32   34   36   36   38   38

 p  239  241  251  257  263  269  271  277  281  283  293  307  311  313  317  331  337
 d   38   40   40   42   42   44   44   46   46   46   48   50   50   52   52   54   56

 p  347  349  353  359  367  373  379  383  389  397  401  409  419  421  431  433  439
 d   56   58   58   58   60   62   62   62   65   66   66   68   68   70   70   72   72

 p  443  449  457  461  463  467  479  487  491  499
 d   72   74   76   76   76   76   78   80   80   82
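One way to look for a pattern in this data is to compare each tabulated value with the simple guess $2 \lfloor p/12 \rfloor$. The sketch below transcribes the table (the lists `PRIMES` and `VALUES` are our transcription, so any error in them is ours) and lists the primes where the guess fails.

```python
PRIMES = [
    2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59,
    61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139,
    149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233,
    239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337,
    347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439,
    443, 449, 457, 461, 463, 467, 479, 487, 491, 499,
]
VALUES = [
    0, 0, 0, 0, 0, 2, 2, 2, 2, 4, 4, 6, 6, 6, 6, 8, 8,
    10, 10, 10, 12, 12, 12, 14, 16, 16, 16, 16, 18, 18, 20, 20, 22, 24,
    24, 24, 26, 26, 26, 28, 28, 30, 30, 32, 32, 32, 34, 36, 36, 38, 38,
    38, 40, 40, 42, 42, 44, 44, 46, 46, 46, 48, 50, 50, 52, 52, 54, 56,
    56, 58, 58, 58, 60, 62, 62, 62, 65, 66, 66, 68, 68, 70, 70, 72, 72,
    72, 74, 76, 76, 76, 76, 78, 80, 80, 82,
]

table = dict(zip(PRIMES, VALUES))

# Primes where the tabulated valuation differs from the guess 2*floor(p/12).
exceptions = [p for p in PRIMES if table[p] != 2 * (p // 12)]

# Size of the discrepancy at each exceptional prime.
discrepancies = {p: table[p] - 2 * (p // 12) for p in exceptions}
```

With the values as transcribed here, the guess holds for every prime in the table except $p = 139$, where the table exceeds it by 2, and $p = 389$, where the table lists 65 against a guess of 64.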

















4 Speculations 

Motivated by the promise of a pattern suggested by the table in Section 3.2, we
computed $d_k(\Gamma_0(p))$ for many values of $k$ and $p$. Our observations led us to the
following results and conjectures.

Theorem 2. Suppose $p$ is a prime and $k \geq 4$ is an even integer. Then
$d_k(\Gamma_0(p)) > 0$ unless

$$(p, k) \in \{(2,4), (2,6), (2,8), (2,10), (3,4), (3,6), (3,8), (5,4), (5,6), (7,4), (11,4)\},$$

in which case $d_k(\Gamma_0(p)) = 0$.

Proof. From [Rib91], mod $p$ eigenforms on $\Gamma_0(p)$ of weight $k$ arise exactly from
mod $p$ eigenforms on $\Gamma_0(1)$ of weight $(k/2)(p + 1)$. Moreover, there is an equality
of dimensions of vector spaces:

$$\dim S_{(k/2)(p+1)}(\Gamma_0(1)) + \dim S_{(k/2)(p+1)-(p-1)}(\Gamma_0(1)) = \dim S_k(\Gamma_0(p)).$$

Thus the dimension of $S_k(\Gamma_0(p))$ is bigger than the number of mod $p$
eigenforms whenever $\dim S_{(k/2)(p+1)-(p-1)}(\Gamma_0(1))$ is non-zero. The cases of dimension
zero correspond exactly to the finite list of exceptions above, for which one can
explicitly calculate that $d_k(\Gamma_0(p)) = 0$.
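The dimension count in this proof can be checked numerically. The sketch below uses the classical formula for the dimension of the space of level one cusp forms of even weight $w$, and, as a simplifying assumption of ours, looks only at odd primes $p$ (for $p = 2$ the second weight is odd and is not handled here). The function names are ours.

```python
def dim_cusp_level_one(w):
    """Dimension of S_w(SL_2(Z)); zero for odd or small weights."""
    if w < 12 or w % 2 == 1:
        return 0
    return w // 12 - 1 if w % 12 == 2 else w // 12


def second_weight(p, k):
    """The weight (k/2)(p+1) - (p-1) appearing in the dimension equality."""
    return (k // 2) * (p + 1) - (p - 1)


odd_primes = (3, 5, 7, 11, 13, 17, 19, 23)
even_weights = range(4, 41, 2)

# Pairs (p, k) for which the second space has dimension zero.
zero_cases = [
    (p, k)
    for p in odd_primes
    for k in even_weights
    if dim_cusp_level_one(second_weight(p, k)) == 0
]

# Dimension equality for p = 101, k = 4: weights 204 and 104.
dim_s4_level_101 = dim_cusp_level_one(204) + dim_cusp_level_one(104)
```

For odd $p$ the pairs found this way coincide with the exceptions listed in Theorem 2, and for $p = 101$, $k = 4$ the two level one dimensions add up to 25.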

Note that for $k = 2$, however, there is a canonical identification of spaces

$$S_{p+1}(\Gamma_0(1), \mathbf{F}_p) \simeq S_2(\Gamma_0(p), \mathbf{F}_p),$$

described geometrically in [Gro90]. For $k = 4$, the data suggests that the
discriminant valuations $d_4(\Gamma_0(p))$ are significantly larger than zero for large $p$, and the table
above suggests a formula of the form $2 \cdot \lfloor p/12 \rfloor$. (Not entirely coincidentally, this
is the difference in dimension of the spaces $S_4(\Gamma_0(p))$ and $S_{2(p+1)}(\Gamma_0(1))$.) This
exact formula is not correct, however, as evidenced by the case when $p = 139$.
If we consider the Hecke algebra $\mathbf{T}_4$ for $p = 139$ in more detail, however, we
observe that $\mathbf{T}_4 \otimes \mathbf{Q}_{139}$ is ramified at 139, and in particular contains two copies
of the field $\mathbf{Q}_{139}(\sqrt{139})$. Just as in the case when $k = 2$ and $p = 389$, there is a
"self congruence" between the associated ramified eigenforms and their Galois
conjugates. For all other $p$ in the range of the table, there is no ramification, and
all congruences take place between distinct eigenforms. Such congruences are
measured by the index of the Hecke algebra, which is defined to be the index of
$\mathbf{T}$ in its normalisation $\tilde{\mathbf{T}}$. If we are only interested in mod $p$ congruences (rather
than mod $\ell$ congruences for $\ell \neq p$), one can restrict to the index of $\mathbf{T} \otimes \mathbf{Z}_p$ inside
its normalisation. There is a direct relation between the discriminant and the
index. Suppose that $\mathbf{T} \otimes \mathbf{Q}_p = \prod K_i$ for certain fields $K_i / \mathbf{Q}_p$. (We may assume
here that $\mathbf{T}$ is not nilpotent, for otherwise both the discriminant and index are
infinite.) Then if $i_p(\Gamma) = \mathrm{ord}_p([\tilde{\mathbf{T}} : \mathbf{T}])$, then

$$d_p(\Gamma) = 2 i_p(\Gamma) + \sum \mathrm{ord}_p(\Delta(K_i/\mathbf{Q}_p)).$$

If we now return to the example $k = 4$ and $p = 139$, we see that the discrepancy
from the discriminant $d_p(\Gamma_0(139)) = 24$ to the estimate $2 \lfloor 139/12 \rfloor = 22$ is
exactly accounted for by the two eigenforms with coefficients in $\mathbf{Q}_{139}(\sqrt{139})$,
which contribute 2 to the above formula. This leads us to predict that the index
is exactly given by the formula $\lfloor p/12 \rfloor$. Note that for primes $p$ this is exactly
the dimension of $S_{p+3}(\Gamma_0(1))$. Similar computations lead to the following more
general conjecture.

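Let $k = 2m$ be an even integer and $p$ a prime. Let $\mathbf{T}$ be the Hecke algebra
associated to $S_k(\Gamma_0(p))$ and let $\tilde{\mathbf{T}}$ be the integral closure of $\mathbf{T}$ in $\mathbf{T} \otimes \mathbf{Q}$ (which
is a product of number fields).

Conjecture 1. Suppose $p \geq k - 1$. Then

$$\mathrm{ord}_p([\tilde{\mathbf{T}} : \mathbf{T}]) = \left\lfloor \frac{p}{12} \right\rfloor \cdot \binom{m}{2} + a(p, m),$$

where

$$a(p, m) = \begin{cases}
0 & \text{if } p \equiv 1 \pmod{12}, \\
3 \cdot \binom{\lceil m/3 \rceil}{2} & \text{if } p \equiv 5 \pmod{12}, \\
2 \cdot \binom{\lceil m/2 \rceil}{2} & \text{if } p \equiv 7 \pmod{12}, \\
a(5, m) + a(7, m) & \text{if } p \equiv 11 \pmod{12}.
\end{cases}$$

Here $\binom{x}{y}$ is the binomial coefficient "$x$ choose $y$", and floor and ceiling are as
usual. The conjecture is very false if $k \gg p$.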

When $k = 2$, the conjecture specializes to the assertion that $[\tilde{\mathbf{T}} : \mathbf{T}]$ is not
divisible by $p$. A possibly more familiar concrete consequence of the conjecture is
the following conjecture about elliptic curves. The modular degree of an elliptic
curve $E$ is the smallest degree of a surjective morphism $X_0(N) \to E$, where $N$
is the conductor of $E$.

Conjecture 2. Suppose $E$ is an elliptic curve of prime conductor $p$. Then $p$ does
not divide the modular degree of $E$.

Using the algorithm in [Wat02], M. Watkins has computed modular degrees of
a huge number of elliptic curves of prime conductor $p < 10^7$, and not found a
counterexample. Looking at smaller data, there is only one elliptic curve $E$ of
prime conductor $p < 20000$ such that the modular degree of $E$ is even as big
as the conductor of $E$, and that is a curve of conductor 13723. This curve has
equation [1, 1, 1, -10481, 408636] and modular degree $m_E = 16176 = 2^4 \cdot 3 \cdot 337$.
The modular degree can be divisible by large primes. For example, there is
a Neumann-Setzer elliptic curve of prime conductor 90687593 whose modular
degree is 1280092043, which is over 14 times as big as 90687593. In general,
for an elliptic curve of conductor $N$, one has the estimate $m_E \gg N^{7/6 - \epsilon}$ (see
[Wat04]).

5 Conjectures Inspired by Conjecture 1 

First, some notation. Let $p$ be an odd prime. Let $\Gamma = \Gamma_0(p)$, and let

$S_k(R)$

The spaces $S_k$ carry an action of the Hecke algebra $\mathbf{T}^{\mathrm{new}}$, and a Fricke involution
$w_p$. If $\frac{1}{2} \in R$, the space $S_k$ can be decomposed into $+$ and $-$ eigenspaces for $w_p$.
We call the resulting spaces $S_k^+$ and $S_k^-$ respectively. Similarly, let $M_k^+$ and $M_k^-$
be the $+1$ and $-1$ eigenspaces for $w_p$ on the full spaces of new modular forms
of weight $k$ for $\Gamma_0(p)$.

It follows from [AL70, Lem. 7] (which is an explicit formula for the trace to
lower level) and the fact that $U_p$ and $w_p$ both preserve the new subspace, that
the action of the Hecke operator $U_p$ on $S_k$ is given by the formula

$$U_p = -p^{(k-2)/2} w_p.$$

This gives rise to two quotients of the Hecke algebra:

$$\mathbf{T}^+ = \mathbf{T}^{\mathrm{new}}/(U_p + p^{(k-2)/2}) \quad \text{and} \quad \mathbf{T}^- = \mathbf{T}^{\mathrm{new}}/(U_p - p^{(k-2)/2}),$$

where $\mathbf{T}^+$ and $\mathbf{T}^-$ act on $S^+$ and $S^-$, respectively. Recall that $\tilde{\mathbf{T}}$ is the
normalization (integral closure) of $\mathbf{T}$ in $\mathbf{T} \otimes \mathbf{Q}$. Let $\tilde{\mathbf{T}}^{\mathrm{new}}$ denote the integral closure of
$\mathbf{T}^{\mathrm{new}}$ in $\mathbf{T}^{\mathrm{new}} \otimes \mathbf{Q}$.



Lemma 1. There are injections

We now begin stating some conjectures regarding the rings $\mathbf{T}^{\pm}$.

Conjecture 3. Let $k < p - 1$. Then $\mathbf{T}^+$ and $\mathbf{T}^-$ are integrally closed. Equivalently,
all congruences between distinct eigenforms in $S_k(\overline{\mathbf{Z}}_p)$ take place between $+$ and
$-$ eigenforms.

Note that for $k = 2$, there cannot be any congruences between $+$ and $-$ forms
because this would force $1 \equiv -1 \bmod p$, which is false, because $p$ is odd. Thus
we recover the conjecture that $p \nmid [\tilde{\mathbf{T}} : \mathbf{T}]$ when $k = 2$. Our further conjectures
go on to describe explicitly the congruences between forms in $S_k^+$ and $S_k^-$.

Let $E_2$ be the non-holomorphic Eisenstein series of weight 2. The $q$-expansion
of $E_2$ is given explicitly by

$$E_2 = 1 - 24 \sum_{n=1}^{\infty} q^n \Big( \sum_{d \mid n} d \Big).$$

Moreover, the function $E_2^* = E_2(\tau) - pE_2(p\tau)$ is holomorphic of weight 2 and
level $\Gamma_0(p)$, and moreover on $q$-expansions, $E_2^* \equiv E_2 \bmod p$.
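The congruence between the two $q$-expansions is visible directly from the coefficients. The sketch below computes the first few coefficients of $E_2$ from the divisor-sum formula and of $E_2(\tau) - pE_2(p\tau)$ by subtracting $p$ times the coefficients of $E_2$ placed at the exponents divisible by $p$; the choice $p = 5$ and the function names are ours, for illustration only.

```python
def sigma1(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def e2_coefficients(num_terms):
    """Coefficients of 1 - 24 * sum_{n>=1} sigma1(n) q^n up to q^(num_terms-1)."""
    return [1] + [-24 * sigma1(n) for n in range(1, num_terms)]


def e2_star_coefficients(p, num_terms):
    """Coefficients of E_2(tau) - p * E_2(p * tau), truncated."""
    e2 = e2_coefficients(num_terms)
    out = list(e2)
    # E_2(p * tau) has the coefficient of q^m moved to q^(p*m).
    for n in range(0, num_terms, p):
        out[n] -= p * e2[n // p]
    return out


p = 5
e2 = e2_coefficients(30)
e2_star = e2_star_coefficients(p, 30)
congruent_mod_p = all((a - b) % p == 0 for a, b in zip(e2, e2_star))
```

The constant term of the second series is $1 - p$, and every coefficient differs from the corresponding coefficient of $E_2$ by a multiple of $p$.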

Lemma 2. Let $p > 3$. Let $f \in M_k(\Gamma_0(p), \mathbf{F}_p)$ be a Hecke eigenform. Then $\theta f$ is
an eigenform inside $S_{k+2}(\Gamma_0(p), \mathbf{F}_p)$.

Proof. One knows that $\partial f = \theta f - k E_2 f / 12$ is of weight $k + 2$. On $q$-expansions,
$E_2 \equiv E_2^* \bmod p$, and thus for $p > 3$,

$$\theta f \equiv \partial f + k E_2^* f / 12 \pmod{p}$$

is the reduction of a weight $k + 2$ form of level $p$. It is easy to see that $\theta f$ is a
cuspidal Hecke eigenform.

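Let us now assume Conjecture 3 and consider the implications for $k = 4$ in
more detail. The space of modular forms $M_2(\Gamma_0(p), \mathbf{F}_p)$ consists precisely of $S_2$
and the Eisenstein series $E_2^*$. The map $\theta$ defined above induces maps:

$$\theta : S_2^+(\mathbf{F}_p) \to S_4(\mathbf{F}_p), \qquad \theta : M_2^-(\mathbf{F}_p) \to S_4(\mathbf{F}_p).$$

The images are distinct, since $\theta f = \theta g$ implies (with some care about $a_p$) that
$f = g$.

Conjecture 4. Let $f \in S_2(\overline{\mathbf{Z}}_p)$ and $g \in S_4(\overline{\mathbf{Z}}_p)$ be two eigenforms such that
$\theta f \equiv g \bmod p$. Then the eigenvalues of $w_p$ on $f$ and $g$ have opposite signs.

Assuming this, we get inclusions:

$$\theta S_2^+(\mathbf{F}_p) \hookrightarrow S_4^-(\mathbf{F}_p), \qquad \theta M_2^-(\mathbf{F}_p) \hookrightarrow S_4^+(\mathbf{F}_p).$$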



Now we are ready to state our main conjecture:

Conjecture 5. There is a Hecke equivariant exact sequence:

$$0 \to \theta S_2^+(\mathbf{F}_p) \to S_4^-(\mathbf{F}_p) \to S_4^+(\mathbf{F}_p) \to \theta M_2^-(\mathbf{F}_p) \to 0.$$

Moreover, the map $S_4^-(\mathbf{F}_p) \to S_4^+(\mathbf{F}_p)$ here is the largest such equivariant map
between these spaces. Equivalently, a residual eigenform of weight 4 and level $p$
occurs in both the $+$ and $-$ spaces if and only if it is not in the image of $\theta$.



Let us give some consequences of our conjectures for the index of $\mathbf{T}^{\mathrm{new}}$ inside
its normalisation. Fix a residual representation $\bar{\rho} : \mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{GL}_2(\mathbf{F}_q)$ and
consider the associated maximal ideal $\mathfrak{m}$ inside $\mathbf{T}_4$. If $\bar{\rho}$ lies in the image of $\theta$
then our conjecture implies that it is not congruent to any other eigenform. If $\bar{\rho}$
is not in the image of $\theta$, then it should arise exactly from a pair of eigenforms,
one inside $S_4^+(\overline{\mathbf{Q}}_p)$ and one inside $S_4^-(\overline{\mathbf{Q}}_p)$. Suppose that $q = p^r$. If there is no
ramification in $\mathbf{T} \otimes \mathbf{Q}$ over $p$ (this is often true), then the $+$ and $-$ eigenforms
will both be defined over the ring $W(\mathbf{F}_q)$ of Witt vectors of $\mathbf{F}_q$. Since $U_p = p$ on
$S_4^-$ and $-p$ on $S_4^+$, these forms can be at most congruent modulo $p$. Thus the
completed Hecke algebra $(\mathbf{T}_4)_{\mathfrak{m}}$ is exactly

$$\{(a, b) \in W(\mathbf{F}_q) \oplus W(\mathbf{F}_q) \mid a \equiv b \bmod p\}.$$

One sees that this has index $q = p^r$ inside its normalisation. Thus the (log of
the) total index is equal to $\sum r_i$ over all eigenforms that occur inside $S_4^+$ and
$S_4^-$, which from our exact sequence we see is equal to

$$\dim S_4^- - \dim S_2^+.$$

Conjecture 1, when $k = 4$, would then follow from the equality of dimensions:

$$\dim S_4^-(\mathbf{F}_p) - \dim S_2^+(\mathbf{F}_p) = \left\lfloor \frac{p}{12} \right\rfloor.$$

We expect that something similar, but a little more complicated, should
happen in general. In weight $2k$, there are mod $p^{k-i}$ congruences exactly between
forms in the image of $\theta^{i-1}$ but not of $\theta^{i}$.



5.1 Examples 

We write small $s$'s and $m$'s for dimensions below.

Let $p = 101$. Then $s_2^+ = 1$, $m_2^- = 7 + 1 = 8$, $s_4^- = 9$, $s_4^+ = 16$. We predict
the index should be $9 - 1 = 8 = \lfloor 101/12 \rfloor$. In the table below, we show the
characteristic polynomials of $T_2$ on $S_4^+$ and $S_4^-$, and for weight 2, we take the
characteristic polynomial of $\theta T_2$ (or the same, taking $F(x/2)$ where $F(x)$ is the
characteristic polynomial of $T_2$). Note that we have to add the Eisenstein series,
which has characteristic polynomial $x - 1 - 2$, which becomes $x - 6 = x + 95$
mod 101 under $\theta$.

Factors of the Characteristic Polynomial of $T_2$ for $p = 101$.

 θS_2^+(F_101):  (x)

 S_4^-(F_101):   (x)
                 (x + 46)
                 (x^2 + 58x + 100)
                 (x^5 + 2x^4 + 27x^3 + 49x^2 + 7x + 65)

 S_4^+(F_101):   (x + 46)
                 (x + 95)
                 (x^2 + 58x + 100)
                 (x^2 + 90x + 78)
                 (x^2 + 96x + 36)
                 (x^3 + 16x^2 + 35x + 72)
                 (x^5 + 2x^4 + 27x^3 + 49x^2 + 7x + 65)

 θM_2^-(F_101):  (x + 95)
                 (x^2 + 90x + 78)
                 (x^2 + 96x + 36)
                 (x^3 + 16x^2 + 35x + 72)



Here are some further conjectures when $k > 4$.

Conjecture 6. Let $p$ and $k$ be such that $4 < k < p - 1$. There is a Hecke
equivariant exact sequence:

$$0 \to \theta S_{k-2}^+(\mathbf{F}_p) \to S_k^-(\mathbf{F}_p) \to S_k^+(\mathbf{F}_p) \to \theta S_{k-2}^-(\mathbf{F}_p) \to 0.$$

Moreover, all forms not in the image of $\theta$ contribute maximally to the index (a
factor of $p^{(k-2)/2}$). Thus the total index should be equal to

$$\frac{k-2}{2} \left( \dim S_k^- - \dim S_{k-2}^+ \right) + \text{the index at level } p \text{ and weight } k - 2.$$

This is the sum

$$\sum_{n=2}^{k/2} \frac{2n - 2}{2} \left( s_{2n}^- - s_{2n-2}^+ \right).$$

When $k = 4$, we need to add the Eisenstein series to $S_2$ in our previous
conjecture. Note that $s_k^+ - s_{k-2}^- = s_k^- - s_{k-2}^+$ for $k > 4$ (and with $s_2^-$ replaced
by $m_2^-$ for the weight 2 term, which arises when $k = 4$). This follows from our
conjectures, but can easily be proved directly. As an example, when $p = 101$, we have
$s_2^+ = 1$, $s_4^- = 9$, $s_6^+ = 17$, $s_8^- = 26$, $s_{10}^+ = 34$, $s_{12}^- = 42$, $s_{14}^+ = 51$, and so we
would predict the indexes $I_k$ to be as given in the following table:

 k     I_k
 2     0
 4     8 = 8 + 0
 6     24 = 24 + 0
 8     51 = 48 + 3
 10    83 = 80 + 3
 12    123 = 120 + 3
 14    177 = 168 + 9
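The predicted indexes for $p = 101$ can be recomputed in two independent ways. The sketch below is an illustration under explicit assumptions: `conjectural_index` evaluates a closed formula of the shape $\lfloor p/12 \rfloor \binom{m}{2} + a(p, m)$ with $k = 2m$, where the correction term for $p \equiv 5 \pmod{12}$ is $3\binom{\lceil m/3 \rceil}{2}$ (this is the only branch exercised here, and it matches the decomposition shown in the table); the branches for other residues are our reading of the conjecture and should be treated as unverified. `recursive_index` adds $\frac{k-2}{2}$ times the difference of consecutive dimensions in the list $1, 9, 17, 26, 34, 42, 51$ to the index in weight $k - 2$.

```python
from math import comb


def ceil_div(a, b):
    """Ceiling of a / b for positive integers."""
    return -(-a // b)


def correction(p, m):
    """Correction term a(p, m); only the residue 5 branch is cross-checked."""
    r = p % 12
    if r == 1:
        return 0
    if r == 5:
        return 3 * comb(ceil_div(m, 3), 2)
    if r == 7:
        return 2 * comb(ceil_div(m, 2), 2)   # assumed form
    if r == 11:
        return correction(5, m) + correction(7, m)
    raise ValueError("p must be a prime greater than 3")


def conjectural_index(p, k):
    m = k // 2
    return (p // 12) * comb(m, 2) + correction(p, m)


# Dimensions quoted for p = 101 in weights 2, 4, ..., 14.
DIMS_101 = {2: 1, 4: 9, 6: 17, 8: 26, 10: 34, 12: 42, 14: 51}


def recursive_index(k):
    """Index in weight k built up from the index in weight k - 2."""
    if k == 2:
        return 0
    step = ((k - 2) // 2) * (DIMS_101[k] - DIMS_101[k - 2])
    return step + recursive_index(k - 2)


weights = range(2, 15, 2)
from_formula = [conjectural_index(101, k) for k in weights]
from_recursion = [recursive_index(k) for k in weights]
```

Both lists reproduce the column of predicted indexes, which is a useful consistency check on the dimensions and on the shape of the formula, though of course not evidence for the conjecture itself.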



This agrees with our conjectural formula, which says that the index should
be equal in this case to

$$8 \binom{k/2}{2} + 3 \binom{\lceil k/6 \rceil}{2};$$

it also agrees with computation.

Abstract. Using powerful tools on genus 2 curves like the Kummer
variety, we generalize the Montgomery method for scalar multiplication
to the Jacobian of these curves. Previously this method was only known
for elliptic curves. We obtain an algorithm that is competitive compared
to the usual methods of scalar multiplication and that has additional
properties such as resistance to timing attacks. This algorithm has very
important applications in cryptography using hyperelliptic curves and
more particularly for people interested in cryptography on smart cards.



1 Introduction 

In this paper we deal with scalar multiplication on the Jacobian of
curves defined over a field of large characteristic. One of the motivations is that
this operation is the main part of cryptography based on the Jacobian of curves,
which is becoming more and more popular. For elliptic curves, Montgomery [24]
developed a method for certain curves (which are said to be in the Montgomery
form) which allows faster scalar multiplication than the usual methods of
exponentiation for groups. This method has the extra advantage that it is resistant to
side-channel attacks, which is very interesting for people who want to use elliptic
curves in cryptography on smart cards. The aim of this paper is the
generalization of this method to genus 2 curves. In the following, $K$ will denote a field
of characteristic $k > 7$. For cryptographic applications, the base field we have in
mind is $\mathbf{F}_p$ where $p$ is prime.

2 The Montgomery Method for Scalar Multiplication on 
Elliptic Curves 

Let $E$ be an elliptic curve defined over $K$ by the equation

$$y^2 = x^3 + a_4 x + a_6.$$

Every elliptic curve defined over $K$ is isomorphic to a curve given by such an
equation, which is called the short Weierstrass form. The set $E(K)$ of the points
$P = (x, y)$ satisfying this equation with $x$ and $y$ in $K$ forms (together with the
point at infinity) a group which will be denoted additively. The problem we are
interested in is the following:

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 153-168, 2004.

Scalar multiplication

Given a point $P \in E(K)$ and an integer $n$, compute $nP$ as fast as possible.

Of course there are a lot of very old methods to do this, such as the classical
double and add algorithm and its variants (like the sliding window method). To
improve these algorithms one can choose other systems of coordinates (i.e. other
means to represent points on the curve) [4]. For example, the best-known
coordinates are projective ones. They are obtained by introducing a new coordinate,
usually called $Z$, which is the lcm of the denominators of $x$ and $y$. Of course, $x$
and $y$ are replaced by $X$ and $Y$ such that $x = X/Z$ and $y = Y/Z$. This choice
of coordinates avoids inversions, which are very costly operations. In
the following, we will work with projective coordinates for reasons of efficiency.
Nevertheless, the same work can be done with other systems of coordinates. The
algorithm we will present here is slightly different from the usual
exponentiation algorithms in the sense that the purpose is not to minimize the number of
operations but rather the cost of each operation.



2.1 The Algorithm 

The original idea of Montgomery [24] was to avoid the computation of the
$y$-coordinate, so that one can hope that basic operations (doubling and adding)
are easier to compute. Since for any $x$-coordinate, there are two corresponding
points on the curve, $(x, y)$ and $(x, -y)$, this restriction is equivalent to identifying
a point on the curve and its opposite. When trying to add two points $\pm P$ and
$\pm Q$, one cannot decide if the result obtained is $\pm(P + Q)$ as required or
$\pm(P - Q)$. Nevertheless, some operations remain possible, like doubling, since it is not
difficult to decide if the result is $\pm 2P$ or the point at infinity. Unfortunately,
doubling is not sufficient for a complete scalar multiplication: one really needs
to perform some additions. In fact additions are possible if the difference $P - Q$
is known. The principle of the computation of $nP$ is to use pairs of consecutive
multiples of $P$, so that the difference between the two components of the pair is
always known and equal to $P$. The algorithm for scalar multiplication is as
follows:

Algorithm 1. Montgomery scalar multiplication algorithm on elliptic curves
Input: $P \in E(K)$ and $n \in \mathbf{Z}$.
Output: $X$ and $Z$-coordinate of $nP$.
Step 1. Initialize $Q = (Q_1, Q_2) = (O, P)$ where $O$ is the point at infinity.
Step 2. If the bit of $n$ is 0, $Q = (2Q_1, Q_1 + Q_2)$.
Step 3. If the bit of $n$ is 1, $Q = (Q_1 + Q_2, 2Q_2)$.
Step 4. After doing that for each bit of $n$, return $Q_1$.

In fact, at each step, $Q = (kP, (k + 1)P)$ for some $k$ and we compute either
$(2kP, (2k + 1)P)$ or $((2k + 1)P, (2k + 2)P)$ in the following step, so that we
always have $Q_2 - Q_1 = P$.

Let us note that, contrary to double and add or sliding window methods, both
an addition and a doubling are done for each bit of the exponent. It is the price
to be paid to avoid the computation of the $y$-coordinate, but we hope that the
gain obtained thanks to this restriction will be sufficient to compensate for the
large number of operations. That is the reason why the Montgomery form for
elliptic curves has been introduced. The interested reader will find more details
for this section in [24] or [26].

2.2 The Montgomery Form 

An elliptic curve $E$ is transformable into the Montgomery form if it is isomorphic
to a curve given by an equation of the type

$$E_M : By^2 = x^3 + Ax^2 + x.$$

It is easy to prove ([26]) that $E$ is transformable into the Montgomery form if and
only if

- the polynomial $x^3 + a_4 x + a_6$ has at least one root $\alpha$ in $K$,

- the number $3\alpha^2 + a_4$ is a square in $K$.

Thus not all elliptic curves are transformable into the Montgomery form.
Nevertheless, since the two coefficients can be chosen arbitrarily in $K$, the number
of curves in such a form is of the same order as for general elliptic curves (for
example $O(p^2)$ if $K = \mathbf{F}_p$).

Please note that the first condition means that there is at least one 2-torsion
point on the curve $E$, so that the cardinality of the curve is even.

2.3 Formulas for Doubling and Adding 

Let us now describe the arithmetic of curves in the Montgomery form. 

Proposition 1. Let $K$ be a field of characteristic $k \neq 2, 3$ and let $E_M$ be an
elliptic curve defined over $K$ in the Montgomery form. Let $P = (X_p, Y_p, Z_p)$ and
$Q = (X_q, Y_q, Z_q) \in E_M(K)$ be given in projective coordinates. Assume that the
difference $P - Q = (x, y)$ is known in affine coordinates. Then we obtain the $X$
and $Z$-coordinates for $P + Q$ and $2P$ by the following formulas:

$$X_{p+q} = \big((X_p - Z_p)(X_q + Z_q) + (X_p + Z_p)(X_q - Z_q)\big)^2,$$

$$Z_{p+q} = x \big((X_p - Z_p)(X_q + Z_q) - (X_p + Z_p)(X_q - Z_q)\big)^2,$$

$$4X_pZ_p = (X_p + Z_p)^2 - (X_p - Z_p)^2,$$

$$X_{2p} = (X_p + Z_p)^2 (X_p - Z_p)^2,$$

$$Z_{2p} = 4X_pZ_p \Big((X_p - Z_p)^2 + \frac{A + 2}{4} \, 4X_pZ_p\Big).$$

In this way, both an addition and a doubling take 3 multiplications and 2 squares
each, so that the cost of this algorithm is about $10|n|_2$ multiplications, where $|n|_2$
denotes the number of bits of $n$.

In the best case with usual scalar multiplication, one needs 4 multiplications
and 4 squarings just for doubling and more for adding. Thus, for curves in the
Montgomery form, this method is interesting. In practice, the gain obtained is
about 10 percent (compared in [6] for 192 bits with a sliding window method of
size 4 after a Koyama-Tsuruoka recoding [18] and using mixed Jacobian modified
coordinates [4]).
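The ladder and the x-only formulas for curves in Montgomery form can be put together in a short sketch. Everything concrete below is an assumption made for illustration: the prime 101, the coefficients $A = 6$ and $B = 1$, and all function names. The code implements the textbook x-only doubling and differential addition, runs the ladder from the pair $(O, P)$, and includes a plain affine addition law so that the two can be compared on small multiples. It assumes Python 3.8 or later for `pow(a, -1, P)` and a base point with nonzero $x$-coordinate.

```python
P = 101        # small prime field, for illustration only
A, B = 6, 1    # hypothetical Montgomery coefficients (A^2 != 4, B != 0)


def inv(a):
    return pow(a % P, -1, P)


def affine_add(p1, p2):
    """Affine addition on B*y^2 = x^3 + A*x^2 + x; None is the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + 2 * A * x1 + 1) * inv(2 * B * y1) % P
    else:
        lam = (y2 - y1) * inv(x2 - x1) % P
    x3 = (B * lam * lam - A - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def xdbl(X, Z):
    """x-only doubling in projective (X : Z) coordinates."""
    s = (X + Z) ** 2 % P
    d = (X - Z) ** 2 % P
    four_xz = (s - d) % P
    a24 = (A + 2) * inv(4) % P
    return s * d % P, four_xz * (d + a24 * four_xz) % P


def xadd(X1, Z1, X2, Z2, x_diff):
    """x-only addition, given the affine x-coordinate of the difference."""
    u = (X1 - Z1) * (X2 + Z2) % P
    v = (X1 + Z1) * (X2 - Z2) % P
    return (u + v) ** 2 % P, x_diff * (u - v) ** 2 % P


def ladder_x(n, x):
    """Affine x-coordinate of n*P from x = x(P), or None for infinity."""
    q1, q2 = (1, 0), (x, 1)          # the pair (O, P)
    for bit in bin(n)[2:]:
        if bit == "0":
            q1, q2 = xdbl(*q1), xadd(*q1, *q2, x)
        else:
            q1, q2 = xadd(*q1, *q2, x), xdbl(*q2)
    X, Z = q1
    return None if Z == 0 else X * inv(Z) % P


def find_point():
    """Brute-force search for an affine point with x != 0 and y != 0."""
    for x in range(1, P):
        rhs = (x ** 3 + A * x * x + x) * inv(B) % P
        for y in range(1, P):
            if y * y % P == rhs:
                return (x, y)
    raise ValueError("no suitable point found")


base = find_point()
```

Each loop iteration performs one doubling and one differential addition regardless of the bit value, which is the regularity that the section on side-channel resistance relies on. The branch on the bit is kept here for readability; a hardened implementation would replace it with a constant-time swap.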

2.4 General Case 

We are now interested in a Montgomery method for scalar multiplication for
elliptic curves which cannot be transformed into the Montgomery form. In fact
the method for scalar multiplication is the same; we just need to have formulas
for doubling and adding. These formulas can be found in [1], [10] or [13]. Let us
recall them.

Proposition 2. Let $K$ be a field of characteristic $k \neq 2, 3$ and let $E$ be an
elliptic curve defined over $K$ as described in Sect. 2. Let $P = (X_p, Y_p, Z_p)$ and
$Q = (X_q, Y_q, Z_q) \in E(K)$ be given in projective coordinates. Assume that the
difference $P - Q = (x, y)$ is known in affine coordinates. Then we obtain the $X$
and $Z$-coordinates for $P + Q$ and $2P$ by the following formulas:

$$X_{p+q} = -4a_6 Z_pZ_q (X_pZ_q + X_qZ_p) + (X_pX_q - a_4 Z_pZ_q)^2,$$

$$Z_{p+q} = x (X_pZ_q - X_qZ_p)^2,$$

$$X_{2p} = (X_p^2 - a_4 Z_p^2)^2 - 8a_6 X_pZ_p^3,$$

$$Z_{2p} = 4Z_p \big(X_p^3 + a_4 X_pZ_p^2 + a_6 Z_p^3\big).$$

Addition can be evaluated in 10 multiplications and doubling in 9. Thus, in this
way, the scalar multiplication can be performed in about $19|n|_2$ multiplications
in the base field. This method can even be optimized in most cases to $17|n|_2$
multiplications [7]. Of course, in this case, the algorithm is not interesting any more
compared with the usual methods. However, it can be useful in some situations
as we will see in the next section.

2.5 Use and Interest in Cryptography 

In this section, we are dealing with elliptic curve cryptography. Elliptic curve 
cryptosystems were simultaneously introduced by Koblitz [14] and Miller [23]. 
They are becoming more and more popular because the key length can be 
chosen smaller than with RSA cryptosystems for the same level of security. 
This small key size is especially attractive for small cryptographic devices like 
smart cards. In all schemes (such as encryption/decryption or signature
generation/verification) the dominant operation is the scalar multiplication of some
point on the elliptic curve. Hence, the efficiency of this scalar multiplication
is central in elliptic curve cryptography, and more generally in cryptography
based on the discrete logarithm problem. In the case where the curve is in the
Montgomery form, we saw in the previous sections that the Montgomery scalar
multiplication method computes the multiple of any given point on the
curve faster than the usual scalar multiplication algorithms. Unfortunately,
we also saw that some elliptic curves cannot be transformed into the
Montgomery form. This is for example the case for most of the standards. The reason
is simple: any curve which can be transformed into the Montgomery form
has a 2-torsion point, so that its cardinality is divisible by 2, and this is not ideal
for cryptographic use since we prefer to use curves with prime order.

In the general case, the Montgomery method can also be applied but is much
more time-consuming. Indeed, we need to perform both an addition and a
doubling for each bit of the exponent. This is not the case, for example, in the classical
double and add algorithm where we only have to perform an addition every two
bits on average (and even fewer with the sliding window method). Nevertheless,
this particularity makes the method resistant to side-channel attacks on smart cards,
which is not the case with other algorithms.

Attacks of this kind use observations like timings [16], power consumption [17]
or electromagnetic radiation [28]. They are based on the fact that addition and
doubling are two different operations. In this situation, it is easy to decide, for
each bit of the exponent, if the algorithm (double and add for example) is
performing either a doubling (if the bit is 0) or a doubling and an addition (if the
bit is 1). Hence, it is easy to recover the whole exponent (which was the secret
key). Of course, various countermeasures have been proposed to secure the
elliptic curve scalar multiplication against side-channel attacks [5]. For example,
if one wants to protect a double and add algorithm, one can perform extra,
useless, additions when the bit of the exponent is 0. In this way, for each bit of
the exponent we perform both an addition and a doubling so that bits of the
exponent are indistinguishable, but this is of course time-consuming.

With the Montgomery scalar multiplication method, we always have to perform
both an addition and a doubling for each bit of the exponent, so that this method
is resistant against side-channel attacks. Therefore it remains of interest even
at the cost of 19 multiplications at each step for general curves.
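
The leak described above can be illustrated with a small Python sketch that records only the sequence of operations (D for a doubling, A for an addition), which is what an idealized side-channel observer is assumed to see. No curve arithmetic is involved, and all function names are hypothetical.

```python
def double_and_add_trace(n):
    """Operation sequence of left-to-right double and add for n >= 1."""
    ops = []
    for bit in bin(n)[3:]:          # bits after the leading 1
        ops.append("D")
        if bit == "1":
            ops.append("A")         # the addition happens only for 1 bits
    return "".join(ops)

def recover_exponent(trace):
    """Rebuild n from a double-and-add trace: 'DA' means bit 1, a lone 'D' means bit 0."""
    bits, i = "1", 0
    while i < len(trace):
        if i + 1 < len(trace) and trace[i + 1] == "A":
            bits, i = bits + "1", i + 2
        else:
            bits, i = bits + "0", i + 1
    return int(bits, 2)

def ladder_trace(n):
    """Operation sequence of the Montgomery ladder: one D and one A for every bit."""
    return "DA" * len(bin(n)[2:])
```

In this model the double-and-add trace determines the exponent completely, whereas the ladder trace depends only on the bit length of the exponent.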

Of course elliptic curves in the Montgomery form are very attractive for people 
interested in elliptic curve cryptosystems on smart cards since, on the one hand, 
the scalar multiplication method is the most efficient one known to date and, on 
the other hand, it is resistant to side-channel attacks. That is one of the reasons 
why we want to generalize this method to hyperelliptic curves of genus 2. 
Finally, for some cryptosystems, the x-coordinate of nP is sufficient but others,
like the elliptic curve signature scheme ECDSA, require the y-coordinate. To
recover it, we use the following result from [27] in the case of a curve in the
Montgomery form.

Proposition 3. Suppose that $R = P + Q$ with $P = (x_1, y_1)$, $Q = (x', y')$ and
$R = (x'', y'')$. Then, if $y_1 \neq 0$, one has

$y' = \dfrac{(x' x_1 + 1)(x' + x_1 + 2A) - 2A - (x' - x_1)^2 x''}{2 B y_1}$ .

For general curves, it is also possible to recover the y-coordinate ([1]). 
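
As a hedged illustration of the recovery formula for curves in the Montgomery form $By^2 = x^3 + Ax^2 + x$, the following Python sketch evaluates it over a toy prime field. The parameters p = 1009, A = 3, B = 5 and the function name are assumptions made for the example, and the formula is used here only for Q different from P, -P and the point at infinity.

```python
# Toy Montgomery curve B*y^2 = x^3 + A*x^2 + x over F_p (illustrative assumptions).
p, A, B = 1009, 3, 5

def recover_y(P, x_q, x_r):
    """y-coordinate of Q from P = (x1, y1) with y1 != 0, x(Q) = x_q and x(P + Q) = x_r."""
    x1, y1 = P
    num = (x_q * x1 + 1) * (x_q + x1 + 2 * A) - 2 * A - (x_q - x1) ** 2 * x_r
    # pow(..., -1, p) is the modular inverse (Python 3.8+)
    return num * pow(2 * B * y1, -1, p) % p
```

In a ladder computation, x_q and x_r would be the x-coordinates of nP and (n+1)P, which are both available at the end of the loop.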

In order to generalize this method to genus 2 curves, let us first recall 
some lowbrow background on these curves. 

3 Background on Genus 2 Curves 

First, let us note that every genus 2 curve is hyperelliptic, so that, in the
following, we will not state that the curves we are interested in are hyperelliptic.
Moreover, we will concentrate on imaginary hyperelliptic curves. Since the
characteristic of the field $K$ has been chosen different from 2 and 5, the hyperelliptic
curves we are interested in are given by an equation of the form

$C : y^2 = f(x) = x^5 + f_3 x^3 + f_2 x^2 + f_1 x + f_0$ with $f_0, f_1, f_2, f_3 \in K$ . (1)

Contrary to elliptic curves, the set of points on a genus 2 curve does not form
a group. Instead, one defines the Jacobian of $C$, denoted $J(C)$, which is the
quotient of the group of divisors of degree 0 by the principal divisors. In the case
of elliptic curves, this Jacobian is isomorphic to the curve itself. More details on
the definition of the Jacobian can be found in [15]. Our purpose in this paper is
to give an algorithm for scalar multiplication in this Jacobian. There are mainly
two ways to represent elements in the Jacobian. The first one is a consequence
of the Riemann-Roch theorem and says that a divisor class can be represented
by a pair of points ($P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$) on the curve which are
conjugate over some quadratic extension of the base field $K$. The second one
makes explicit the correspondence of ideal classes and divisor classes and was
introduced by Mumford [25]:

Theorem 1. Let the function field be given via the irreducible polynomial $y^2 =
f(x)$ where $f \in K[x]$ and $\deg(f) = 5$. Each non-trivial ideal class over $K$ can be
represented via a unique ideal generated by $u(x)$ and $y - v(x)$, $u, v \in K[x]$, where
$u$ is monic, $\deg(v) < \deg(u) \leq 2$ and $u \mid (v^2 - f)$.

The correspondence between these representations is that $u(x) = (x - x_1)(x - x_2)$
and $v(x_i) = y_i$ with appropriate multiplicities. This Mumford representation
was used by Cantor to develop his algorithm to compute the group law on
Jacobians of curves [2]. Several researchers such as Harley [12], or more recently
Lange [19], [20], [21], made explicit the steps of Cantor's algorithm and listed the
operations one really needs to perform. They obtained explicit formulas for the
group law on the Jacobian.

The basic objects are now defined and we can give an analog for genus 2
curves of the Montgomery form for elliptic curves.
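
The correspondence between the two representations of Sect. 3 can be sketched in Python for the simplest case of two rational points with distinct x-coordinates. The toy prime p = 101 and the function name are assumptions for illustration; points defined over a quadratic extension or with equal x-coordinates are not handled.

```python
p = 101  # toy prime (an assumption for illustration)

def mumford_from_points(P1, P2):
    """Mumford pair (u, v) for the class represented by two affine points, x1 != x2.

    u = x^2 + u1*x + u0 and v = v1*x + v0 are returned as ((u1, u0), (v1, v0)).
    """
    (x1, y1), (x2, y2) = P1, P2
    if (x1 - x2) % p == 0:
        raise ValueError("this sketch only handles points with distinct x")
    u1 = -(x1 + x2) % p                       # u = (x - x1)(x - x2)
    u0 = (x1 * x2) % p
    v1 = (y2 - y1) * pow(x2 - x1, -1, p) % p  # v is the line through P1 and P2
    v0 = (y1 - v1 * x1) % p
    return (u1, u0), (v1, v0)
```

Since v interpolates the two points, $v^2 - f$ vanishes at both roots of u, which is the divisibility condition of Theorem 1 in this case.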

4 A Montgomery-Like Form for Genus 2 Curves 

4.1 Definition 

In the following, we will say that a curve C is transformable into Montgomery-like 
form if it is isomorphic to a curve given by an equation of the type 

$By^2 = x^5 + f_4 x^4 + f_3 x^3 + f_2 x^2 + x$ . (2)

It is easy to prove that a curve C as defined in Sect. 3 is transformable into 
Montgomery-like form if and only if 

— the polynomial f{x) has at least one root a in K. 

— the number f'{a) is a fourth power in K. 

Thus, as in the case of elliptic curves, not all the curves are transformable into 
the Montgomery-like form. Nevertheless, since the four coefficients can be chosen 
arbitrarily in K, the number of such curves is about the same as for general genus 
2 curves {0{p'^) if K = Fp). 

Please note that the first condition means that there is at least one 2-torsion 
element in the Jacobian variety of the curve C, so that the cardinality of the 
Jacobian is even. 
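
The two conditions stated above are easy to test over a small prime field. The following Python sketch, with hypothetical function names and a brute-force search that is only sensible for toy primes, lists the roots a of f for which f'(a) is a fourth power.

```python
def poly_eval(coeffs, x, p):
    """Evaluate a polynomial given by coefficients from highest degree down (Horner)."""
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % p
    return acc

def poly_derivative(coeffs):
    """Coefficients of the derivative, highest degree first."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def montgomery_like_roots(coeffs, p):
    """Roots a of f in F_p such that f'(a) is a nonzero fourth power in F_p."""
    fourth_powers = {pow(t, 4, p) for t in range(1, p)}
    df = poly_derivative(coeffs)
    return [a for a in range(p)
            if poly_eval(coeffs, a, p) == 0 and poly_eval(df, a, p) in fourth_powers]
```

For example, over $\mathbb{F}_{13}$ the polynomial $x^5 + x$ satisfies both conditions at a = 0, while $x^5 + 2x$ has the root 0 but derivative 2 there, which is not a fourth power modulo 13.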



4.2 The Kummer Surface 

With elliptic curves, the main idea of the Montgomery method was to avoid the
computation of the y-coordinate. At first sight an analog for genus 2 curves would
be to avoid the computation of the polynomial $v$ in the Mumford representation
and keep only $u$. But this is not satisfactory since it has no mathematical meaning.
In fact, with elliptic curves, working only with the x-coordinate means that we
identify a point and its opposite. The analog for genus 2 curves is called the
Kummer surface, where a divisor and its opposite are identified. The Kummer
surface is a quartic surface in $\mathbb{P}^3$. We give here the definition of the Kummer
surface and its properties without proofs for curves in Montgomery-like
form. They were obtained using the same method as that in the book of Cassels
and Flynn on genus 2 curves ([3] or [8]). The Kummer surface is the image of
the map



$\kappa : J(K) \longrightarrow \mathbb{P}^3(K)$

$\{(x_1, y_1), (x_2, y_2)\} \longmapsto \left(1,\ x_1 + x_2,\ x_1 x_2,\ \dfrac{F_0(x_1, x_2) - 2 B y_1 y_2}{(x_1 - x_2)^2}\right)$

with

$F_0(x_1, x_2) = (x_1 + x_2) + 2 f_2 x_1 x_2 + f_3 (x_1 + x_2) x_1 x_2 + 2 f_4 x_1^2 x_2^2 + (x_1 + x_2) x_1^2 x_2^2$ .

In the following, for any divisor $A \in J(C)$, we will denote

$\kappa(A) = (k_1(A), k_2(A), k_3(A), k_4(A))$ .

More precisely, the Kummer surface is the projective locus given by an equation
$K$ of degree four in the first three variables and of degree two in the last one.
The exact equation can be found online [9]. In passing from the Jacobian to the
Kummer surface, we have lost the group structure (as was already the case with
elliptic curves) but traces of it remain. For example, it is possible to double on
the Kummer surface.

Nevertheless, for general divisors $A$ and $B$, we cannot determine the values of
the $k_i(A + B)$ from the values of the $k_i(A)$ and $k_i(B)$ since the latter do not
distinguish between $\pm A$ and $\pm B$, and so not between $\pm(A + B)$ and
$\pm(A - B)$. However the values of the
$k_i(A + B) k_j(A - B) + k_i(A - B) k_j(A + B)$ are well determined. We have

Theorem 2. There are explicit polynomials $\varphi_{ij}$ biquadratic in the $k_i(A)$, $k_i(B)$
such that projectively

$k_i(A + B) k_j(A - B) + k_i(A - B) k_j(A + B) = \varphi_{ij}(A, B)$ . (3)

Using these biquadratic forms, we can easily compute the $k_i(A + B)$ if the
$k_i(A - B)$ are known. We can also compute the $k_i(2A)$ by putting $A = B$.



4.3 Formulas for Adding 

Proposition 4. Let $K$ be a field of characteristic $\neq 2, 3$ and let $C$ be a curve
of genus 2 defined over $K$ in the Montgomery form as defined in Sect. 4.1. Let $\mathcal{K}$
denote the Kummer surface of $C$. Let $A$ and $B$ be two divisors on the Jacobian of
$C$, $\kappa(A) = (k_1(A), k_2(A), k_3(A), k_4(A))$ and $\kappa(B) = (k_1(B), k_2(B), k_3(B), k_4(B))$
their images in the Kummer surface. Assume that the difference $A - B$ is known
and that the last coordinate of its image in the Kummer surface is one (remember
we are in $\mathbb{P}^3(K)$). Then we obtain the Kummer coordinates for $A + B$ by the
following formulas:

kfiA + B) = q,n{A,B) , 

k2{A + B) = 2 ((pi2(M, B) -ki{A + B)k2{A - B)) , 

k3{A + B) = kM-B)v33{A,B) , 

k4{A + B) = 2{<pi4{A, B) -ki{A + B)k4{A - B)) , 



where the $\varphi_{ij}$ are the biquadratic forms described in Sect. 4.2.
The expressions of the $\varphi_{ij}(A, B)$ are available by anonymous ftp [9] but require a
large number of operations in the base field to be computed. The main difficulty
is to find expressions which require the fewest possible multiplications in $K$. We
now give more precisely these expressions for the $\varphi_{ij}$ we are interested in. For
clarity we will denote $\kappa(A) = (k_1, k_2, k_3, k_4)$ and $\kappa(B) = (l_1, l_2, l_3, l_4)$.

^ii{AtB) = {{kAi — kili) + (^2^3 — ^ 3 ^ 2 ))^ j 

'^12{AtB) = ((^2^3 + ^ 3 ^ 2 ) + (fcl^4 + fc4^l))(/3(^1^3 + kAl) + (^2^4 + ^ 4 ^ 2 )) + 

2 {k\l^ + kAi){f 2 {kilz + kAi) + {k\l 2 + ^ 2 ^ 1 ) ~ (^ 3^4 + kAz)) + 

2/4(^1^4 + ^4^l)(fc2^3 + ^ 3 ^ 2 ) 5 

‘/533(-4, .S) = {{kAi — kAz) + {k\l2 — k2li)Y , 

i^xAAtB) = {kill — kzlz){fA{kAi + kAi) — (^2^3 + kzh)) + 

2 ((fci ^2 + ^2^1) ~ (^ 3^4 + kAz)) + 
f2{kAA + ^2^2) + 2/4(fci^i — kzlz)) + 

(^2^2 ~ ^ 4 ^ 4 )((^ 2^3 + kzh) ~ {kAi + ^4^1) ~ f2{kAl + kzh)) ■ 



4.4 Formulas for Doubling 

Proposition 5. Let $K$ be a field of characteristic $\neq 2, 3$ and let $C$ be a curve
of genus 2 defined over $K$ in the Montgomery form as defined in Sect. 4.1. Let $\mathcal{K}$
denote the Kummer surface of $C$. Let also $A$ be a divisor on the Jacobian of $C$
and $\kappa(A) = (k_1, k_2, k_3, k_4)$ its image in the Kummer surface. Then we obtain the
Kummer coordinates for $2A$ ($\kappa(2A) = (\delta_1, \delta_2, \delta_3, \delta_4)$) by the following formulas:

$\delta_1 = 2\varphi_{14}(A, A)$ ,

$\delta_2 = 2\varphi_{24}(A, A) + 2 f_3 K(A)$ ,

$\delta_3 = 2\varphi_{34}(A, A)$ ,

$\delta_4 = \varphi_{44}(A, A) + 2 K(A)$ ,

where the $\varphi_{ij}$ are the biquadratic forms described in Sect. 4.2 and $K$ is the
equation of the Kummer surface also described in Sect. 4.2 and such that $K(A) = 0$.

This is just a consequence of Theorem 2. Let us note that in $\delta_2$ and $\delta_4$ we
added a multiple of the equation of the Kummer surface in order to simplify the
expressions as much as possible. We now give more precisely these expressions
for the $\delta_i$.

(5i = 8(fc^ — kz){fA^i ~ ^ 3 ) + 2(k\k2 — ^ 3 ^ 4 )) + 

8(fcifc4 - k2kz){kl -kl + f2{kiki - ^ 2 ^ 3 ) + faikf - k§)) , 

82 = 8 (ki + fcg — fzkikz — 3k2kA{k2 + — fsiki + k^ + dfcifcs) + 

16(/c2^4 + f3kiklz){fi{kik2 + kzki) + 2(/c| + fc^) + f 2 {kiki + k 2 kz)) + 
32fci/c3 (4*2^4 + / 2 (fcifc 2 + kzki) + (/I + fi)kik3 + 8f4{kik4 + k2k3)) , 

^3 = 8{k\ — kl){f2{kl — fcg) + 2{kik4 — ^ 2 ^ 3 ) + fs{kik2 — ^ 3 ^ 4 )) + 

8{k3k4 - kik2){kj - + /4(fc3^4 - kik2)) , 

84 = (^2 + k1){{k2 + k1) — 2 f 3 {ki + fcg) — 8^1/03) + 

(ki + k3){f3k\k3 + /4(fci/c4 + ^ 2 ^: 3 ) + 2 /C 2 A 4 + f2{klk2 + ^ 3 ^ 4 ) + 

{fi - ^f2h){kl + kl)) - 8f2f4{klk3)^ . 



5 The Montgomery Scalar Multiplication on Genus 2 
Curves in Montgomery-Like Form 

5.1 Algorithm 

We give here an analog for genus 2 curves of the Montgomery method for scalar
multiplication on elliptic curves described in Sect. 2.1. In the case of elliptic
curves, Montgomery's method [24] avoids the computation of the y-coordinate.
We saw that an equivalent method in genus 2 was to work on the Kummer
surface. Of course we have the same restriction in the case of genus 2 curves,
namely that it is not possible to add two divisors unless their difference is
known. If $D$ is some divisor, recalling that our goal is the computation of $nD$
for some integer $n$, the principle is, as was already the case for elliptic curves,
to use pairs of consecutive multiples of $D$, so that the difference between the two
components of the pair is always known and equal to $D$. The algorithm for scalar
multiplication is as follows:

Algorithm 2. Montgomery scalar multiplication algorithm for genus 2 curves
Input : $D \in J(K)$ and $n \in \mathbb{Z}$.

Output : $\kappa(nD)$, the image in the Kummer surface of $nD$.

Step 1. Initialize $(A, B) = ((0, 0, 0, 1), \kappa(D))$ where $(0, 0, 0, 1)$ is the image
in the Kummer surface of the neutral element on $J(C)$.

Step 2. If the bit of $n$ is 0, $(A, B) = (2A, A + B)$.

Step 3. If the bit of $n$ is 1, $(A, B) = (A + B, 2B)$.

Step 4. After doing that for each bit of $n$, return $A$.

Note that, at each step, we always have $B - A = \kappa(D)$ so that the addition of
$A$ and $B$ is possible.
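
The structure of Algorithm 2 can be illustrated on a toy model that is not the Kummer surface of a curve: the cyclic group $\mathbb{Z}/m\mathbb{Z}$ with each element identified with its opposite. This stand-in, the modulus and all function names are assumptions made for the example. As on the Kummer surface, the class of a + b cannot be told apart from that of a - b unless the class of the difference is known.

```python
def cls(x, m):
    """Class of x in (Z/mZ)/{+1, -1}, represented by the smaller of x and -x mod m."""
    return min(x % m, -x % m)

def pseudo_add(A, B, D, m):
    """Class of a + b from the classes A, B of a, b and the class D of their difference."""
    s, t = cls(A + B, m), cls(A - B, m)   # classes of a + b and a - b, in unknown order
    return s if t == D else t             # discard the candidate equal to the difference

def ladder(n, d, m):
    """Class of n*d, following Steps 1 to 4 of Algorithm 2."""
    D = cls(d, m)
    A, B = 0, D                           # images of the neutral element and of d
    for bit in bin(n)[2:]:                # bits of n, most significant first
        if bit == "0":
            A, B = cls(2 * A, m), pseudo_add(A, B, D, m)
        else:
            A, B = pseudo_add(A, B, D, m), cls(2 * B, m)
    return A
```

Doubling needs no extra information in this model either, which mirrors the remark of Sect. 4.2 that doubling remains possible after the group structure is lost.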



5.2 Number of Operations 

At each step of the algorithm, we perform both an addition and a doubling, 
hence we just have to count the number of operations required for each of them. 
In the following, M will denote a multiplication in $K$ and S a squaring.

Table 1. Addition of $A$ and $B$ in $\mathcal{K}(C)$ if $A - B$ is known

expressions                 operations
precomputations             16M
$\varphi_{11}(A, B)$        S
$\varphi_{12}(A, B)$        6M
$\varphi_{33}(A, B)$        S
$\varphi_{14}(A, B)$        6M
$\kappa(A + B)$             3M
total                       31M + 2S



Remark 1. The 31 multiplications include 7 multiplications by coefficients of the 
curve. 



Table 2. Doubling of $A$ in $\mathcal{K}(C)$

expressions                                  operations
precomputations ($i, j = 1, \dots, 4$)       6M + 4S
$\delta_1$                                   5M
$\delta_2$                                   11M
$\delta_3$                                   5M
$\delta_4$                                   5M
total                                        31M + 5S



Remark 2. The 31 multiplications include 16 multiplications by coefficients of
the curve. Moreover the multiplications $f_3 k_1 k_3$, $f_3(k_1^2 + k_3^2)$, $f_4(k_1 k_4 + k_2 k_3)$ and
$f_2(k_1 k_2 + k_3 k_4)$ are not counted in $\delta_4$ since they were already computed in $\delta_2$.
Finally, we of course assumed that 72/4, 7| “47274 and 71+74 were precomputed. 

Hence, on a curve in the Montgomery form as in (2), the scalar multiplication
using the Montgomery method requires $69\,|n|_2$ base field multiplications (assuming
that a squaring is a multiplication), where $|n|_2$ is the number of bits of $n$.

5.3 Comparison with Usual Algorithms for Scalar Multiplication 

To date, the best algorithms for scalar multiplication on genus 2 curves defined
over a field of odd characteristic are obtained by using mixed weighted projective
coordinates [21]. In this case, Lange needs 41 multiplications both for a mixed
addition and for a doubling. Hence our formulas require fewer base field
operations. But, in the Montgomery algorithm, we must perform both an addition
and a doubling for each bit of the exponent whereas one can use efficient
algorithms (like the sliding window method) with Lange's formulas. Nevertheless,
this algorithm is still interesting for many reasons.

— As was the case for elliptic curves and as explained in Sect. 2.5, the
Montgomery algorithm is resistant to side-channel attacks, contrary to other
algorithms for scalar multiplication. For this reason it will be of interest to
people who need to implement hyperelliptic curve protocols on smart cards
or systems sensitive to side-channel attacks. For example, if one wants to
make algorithms using mixed weighted projective coordinates safe, one needs
to perform an extra addition when the bit of the exponent is zero. In this
case, for each bit of the exponent, 82 base field operations are required and
with only 69, our algorithm allows a gain of 16 percent, which is significant.

— This algorithm is very easy to implement, there are no precomputations
(as in the sliding window method) and an element on the Kummer surface
requires only 4 base field elements whereas weighted projective coordinates
require 8 of them, so that it is also interesting in terms of memory usage.
This last remark will be an advantage for constrained environments like
smart cards.

— The cost depends strongly on the coefficients of the curve. Indeed there are 23
multiplications by these coefficients but only 2 in Lange's formulas. Hence a
good choice of the coefficients of the curve certainly allows better timings.
This is the subject of the following section.
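
The per-bit counts quoted in the first item can be reproduced with a few lines of Python. The figures are those of Tables 1 and 2 and of [21] as cited above; the `square_ratio` parameter (cost of a squaring relative to a multiplication) is an assumption added for illustration, set to 1 as in Sect. 5.2.

```python
ADD = {"M": 31, "S": 2}   # total of Table 1 (addition on the Kummer surface)
DBL = {"M": 31, "S": 5}   # total of Table 2 (doubling on the Kummer surface)
LANGE_OP = 41             # one mixed addition or one doubling in [21]

def per_bit_cost(square_ratio=1.0):
    """Cost of one ladder step, counting a squaring as square_ratio multiplications."""
    return (ADD["M"] + DBL["M"]) + square_ratio * (ADD["S"] + DBL["S"])

montgomery = per_bit_cost()        # one addition and one doubling per bit
always_add = 2 * LANGE_OP          # double and always add with Lange's formulas
gain = (always_add - montgomery) / always_add
```

With these inputs the ladder costs 69 multiplications per bit against 82, a relative gain of about 16 percent.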



5.4 Some Special Cases 

In order to decrease the number of base field operations for our algorithm, certain
choices of coefficients of the curve are preferable. For example there are 6
multiplications by $f_3$ in the formulas given in Sects. 4.3 and 4.4 so that, if one
chooses $f_3 = 0$ or 1, the total number of multiplications necessary for each bit of
the exponent is 63 instead of 69. In the following table, we summarize the gain
obtained in each operation. Let us note that there is no gain for the calculation
of $\varphi_{11}$, $\varphi_{33}$ and precomputations.

Table 3. Gain obtained if . . .

                  $f_2 = 0$   $f_2$ small   $f_3 = 0$ or small   $f_4 = 0$   $f_4$ small
$\varphi_{12}$        1            1                1                2            1
$\varphi_{14}$        2            2                1                1            1
$\delta_1$            1            1                1                1            1
$\delta_2$            2            2                2                2            2
$\delta_3$            1            1                1                1            1
$\delta_4$            2            1                0                2            1
total                 9            8                6                9            7



Remark 3. If two of these conditions on the coefficients are satisfied the gain 
obtained is just the sum of the gains. 

Of course this kind of restriction implies that fewer curves are taken into account.
For example, if $K = \mathbb{F}_p$ and $f_3 = 0$, one can only choose three coefficients in $\mathbb{F}_p$
(namely $f_2$, $f_4$ and $B$) so that the number of such curves is 0{p^). Thus we lose 
in generality. However, in cryptography, one only needs to find a curve such that 
the order of its Jacobian is divisible by a huge prime number. For this, one needs 
enough choices of curves in order to be able to find a curve with this property 
and 0{p^) choices are of course widely sufficient. 

Let us now examine more precisely a particular case and compare our algorithm 
to usual ones. Let C be a genus 2 curve defined over K by an equation of the 
form 

$By^2 = x^5 + f_3 x^3 + \varepsilon x^2 + x$ with $\varepsilon = 0$ or $\pm 1$ and $B, f_3 \in K$ . (4)

There are 0{p^) curves in this form (which is sufficient to find one of these 
with nice properties for use in cryptography). Here, our algorithm of scalar 
multiplication requires 52 multiplications for each bit of the exponent whereas 
with mixed weighted projective coordinates, 

— a sliding window method with window size equal to 4 requires on average 48
multiplications,

— a classical double and add requires 61 multiplications on average, 

— a side-channel attack resistant double and add requires 81 multiplications. 

Thus, our algorithm is 15 percent faster than a double and add, not so far from
the sliding window method (around 7 percent) and much more efficient if one
wants the operation to be resistant to side-channel attacks. Indeed, in this case,
we obtain a gain of 36 percent. Of course one can even be faster than the sliding
window method by choosing a small coefficient $f_3$ but the number of such curves
becomes small.
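
The count of 52 multiplications for curves of the form (4) can be recovered from Table 3 together with Remark 3. The following Python sketch encodes the table and adds up the gains. The dictionary keys are hypothetical labels, and the coefficient of $x^2$ in (4) is treated as a small $f_2$, which is an interpretation rather than something stated explicitly in the text.

```python
# Gains of Table 3, per operation, for each condition on the coefficients.
GAIN = {
    "f2=0":          {"phi12": 1, "phi14": 2, "d1": 1, "d2": 2, "d3": 1, "d4": 2},
    "f2 small":      {"phi12": 1, "phi14": 2, "d1": 1, "d2": 2, "d3": 1, "d4": 1},
    "f3=0 or small": {"phi12": 1, "phi14": 1, "d1": 1, "d2": 2, "d3": 1, "d4": 0},
    "f4=0":          {"phi12": 2, "phi14": 1, "d1": 1, "d2": 2, "d3": 1, "d4": 2},
    "f4 small":      {"phi12": 1, "phi14": 1, "d1": 1, "d2": 2, "d3": 1, "d4": 1},
}

def cost_per_bit(*conditions, base=69):
    """Multiplications per bit when the given conditions hold (gains add up, Remark 3)."""
    return base - sum(sum(GAIN[c].values()) for c in conditions)

form_4 = cost_per_bit("f4=0", "f2 small")   # curve (4): no x^4 term, coefficient of x^2 in {0, 1, -1}
```

Under this reading, `form_4` evaluates to 52 and `cost_per_bit("f3=0 or small")` to 63, matching the figures quoted in this section.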

Remark 4. Another means to accelerate this algorithm would be to choose $f_2$,
$f_3$ and $f_4$ one word long. For example, on a 32-bit processor, if we are working
on some finite field of cryptographic size for genus 2 curves, a multiplication
of a coefficient of the curve and an element of the base field is about three
times faster than the usual multiplication in the base field. Hence, as there are
23 multiplications by coefficients of the curve, our algorithm will require the
equivalent of 53 multiplications, which is not so bad.
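
The estimate of Remark 4 can be written out as follows, under the stated assumption that a multiplication by a one-word coefficient costs one third of a general multiplication. The exact factor depends on the processor and the field size, so this is only a rough model.

```latex
\[
  \underbrace{(69 - 23)}_{\text{general multiplications}}
  \;+\; \underbrace{\tfrac{23}{3}}_{\text{multiplications by coefficients}}
  \;=\; 46 + 7.67 \;\approx\; 53.7 ,
\]
```

that is, roughly the equivalent of 53 to 54 general multiplications per bit.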

5.5 Examples 

In this section, the base field is the prime field $\mathbb{F}_{2^{80}+13}$ (so that cryptosystems
based on genus 2 curves defined over this field have the same security level as
those based on elliptic curves defined over some 160-bit prime field). Let $C_1$, $C_2$
and $C_3$ be the genus 2 curves respectively defined by the equations

$44294637780422381596577\, y^2 = x^5 + 27193747657561668783534\, x^4 + 29755922098575219239037\, x^3 + 76047862040141126737826\, x^2 + x$ ,

$10377931047456722522292\, y^2 = x^5 + 77304198867988157865677\, x^3 + x^2 + x$ ,

69418837068442493864220 7/^ = x^ + x^ + x . 

We compared our algorithm on these curves with a sliding window of size 4, a
classical double and add and a double and always add (used to resist
side-channel attacks). For these three algorithms, we of course always used the
weighted projective coordinates as in [21], which are the most efficient ones.
In the following table, we provide the timings obtained using GMP 4.1.2 on a
Pentium IV 3.06 GHz. We carried out 1000 scalar multiplications in each case
with various divisors and 160-bit exponents.

Table 4. Timings

                          $C_1$      $C_2$      $C_3$
Sliding window            13.4 ms    13.3 ms    12.9 ms
Double and add            16 ms      16 ms      15.5 ms
Double and always add     21.5 ms    21.5 ms    21 ms
Montgomery method         18.3 ms    13.6 ms    11.9 ms



6 Conclusion and Prospects 

Thanks to the theory of the Kummer surface of a hyperelliptic curve of genus
2, we have generalized to genus 2 curves the method of Montgomery for scalar
multiplication on elliptic curves. As Montgomery does for elliptic curves, we
restrict to curves transformable into Montgomery-like form. However, there is no
theoretical obstruction to generalizing this method to all genus 2 curves. Indeed
the propositions of Sects. 4.3 and 4.4 remain valid but the total number of
multiplications needed to compute the biquadratic forms is so large that this method
is not competitive with the classical ones. This is not so surprising since it was
already the case for elliptic curves.

In fact, for people interested in cryptography, this restriction is not very impor- 
tant since the number of choices of curves remains largely the same. The only 
significant restriction is that the order of the Jacobian of such curves is even and 
then cannot be prime. But working with a Jacobian whose order is twice a prime 
is not less efficient than working with a prime order. 

For elliptic curves, the standards are not transformable into the Montgomery
form because of this restriction, which is unfortunate because the Montgomery
method for scalar multiplication is the most interesting one (the fastest, easy to
implement, resistant to side-channel attacks). Up to now, there are no standards
for genus 2 curves. If such standards exist one day, it would be useful to take
the method that we developed into account.

Moreover, we have seen that, with some restrictions, we obtain very interesting
timings for the scalar multiplication on the Jacobian of genus 2 curves. It would
be worth verifying (even though there is no reason to expect otherwise) that these
restrictions do not hinder the search for Jacobians suitable for cryptography (i.e.
with a large prime dividing the order). Unfortunately, algorithms for finding the
order of the Jacobian over $\mathbb{F}_p$ are still under development ([11], [22]).

Finally, it would be very interesting to study the case of characteristic 2,
since it is in that case that this method is the most efficient for elliptic curves.
For this, all the necessary mathematical objects, such as the Kummer surface,
remain to be defined.



References 

1. Brier, E., Joye, M.: Weierstrass Elliptic Curves and Side-Channel Attacks, Public 
Key Cryptography, Lecture Notes in Computer Science, 2274 (2002) 

2. Cantor, D. G.: Computing on the Jacobian of a hyperelliptic curve. Math. Comp., 
48 (1987) 95-101 

3. Cassels, J. W. S., Flynn, E. V.: Prolegomena to a middlebrow arithmetic of curves
of genus 2, London Mathematical Society Lecture Note Series, 230 (1996)

4. Cohen, H., Miyaji, A., Ono, T.: Efficient elliptic curve exponentiation using mixed 
coordinates, Asiacrypt’98, Lecture Notes in Computer Science, 1514 (1998) 51-65 

5. Coron, J. S.: Resistance against differential power analysis for elliptic curve cryp- 
tosystems, CHES’99, Lecture Notes in Computer Science, 1717 (1999) 292-302 

6. Doche, C., Duquesne, S.: Manual for the elliptic curve library, Arehcc report (2003) 

7. Duquesne, S.: Improvement of the Montgomery method for general elliptic curves
defined over $\mathbb{F}_p$, preprint (2003)

8. Flynn, E. V.: The group law on the Jacobian of a curve of genus 2, J. reine angew. 
Math., 439 (1993), 45-69 

9. Flynn, E. V.: ftp site, ftp://ftp.liv.ac.uk/pub/genus2/kummer 

10. Fischer, W., Giraud, C., Knudsen, E. W., Seifert, J. P.: Parallel scalar
multiplication on general elliptic curves over $\mathbb{F}_p$ hedged against Non-Differential
Side-Channel Attacks, preprint

11. Gaudry, P., Schost, E.: Construction of secure random curves of genus 2 over prime 
fields, Eurocrypt’04, Lecture Notes in Computer Science (2004) 

12. Harley, R.: Fast arithmetic on genus 2 curves, available at 

http://cristal.inria.fr/~harley/hyper (2000) 

13. Izu, T., Takagi, T.: A fast Elliptic Curve Multiplication Resistant against Side 
Channel Attacks, preprint 

14. Koblitz, N.: Elliptic curve cryptosystems. Math. Comp., 48 (1987) 203-209 

15. Koblitz, N.: Algebraic aspects of cryptography. Algorithms and Computation in 
Mathematics, 3 (1998) 

16. Kocher, P. C.: Timing attacks on implementations of DH, RSA, DSS and other 
systems, CRYPTO’96, Lecture Notes in Computer Science, 1109 (1996), 104-113 

17. Kocher, P. C., Jaffe, J., Jun, B.: Differential power analysis, CRYPTO’99, Lecture 
Notes in Computer Science, 1666 (1999) 388-397 

18. Koyama, K., Tsuruoka, Y.: Speeding up elliptic cryptosystems by using a signed 
binary window method, Crypto’92, Lecture Notes in Computer Science, 740 (1993) 
345-357 

19. Lange, T.: Efficient Arithmetic on Genus 2 Hyperelliptic Curves over Finite Fields 
via Explicit Formulae, Cryptology ePrint Archive, 121 (2002) 

20. Lange, T.: Inversion-Free Arithmetic on Genus 2 Hyperelliptic Curves, Cryptology 
ePrint Archive, 147 (2002) 

21. Lange, T.: Weighted Coordinates on Genus 2 Hyperelliptic Curves, Cryptology 
ePrint Archive, 153 (2002) 

22. Matsuo, K. Chao, J. Tsujii, S.: An improved baby step giant step algorithm for 
point counting of hyperelliptic curves over finite fields, ANTS-V, Lecture Notes in 
Computer Science, 2369 (2002) 461-474 







23. Miller, V. S.: Use of elliptic curves in cryptography, Crypto’85, Lecture Notes in 
Computer Science, 218 (1986) 417-426 

24. Montgomery, P. L.: Speeding the Pollard and elliptic curve methods of
factorization, Math. Comp., 48 (1987) 243-264

25. Mumford, D.: Tata lectures on Theta II, Birkhäuser (1984)

26. Okeya, K., Kurumatani, H., Sakurai, K.: Elliptic curves with the Montgomery-form 
and their cryptographic applications. Public Key Cryptography, Lecture Notes in 
Computer Science, 1751 (2000) 238-257 

27. Okeya, K., Sakurai, K.: Efficient Elliptic Curve Cryptosystems from a Scalar
Multiplication Algorithm with Recovery of the y-Coordinate on a Montgomery-Form
Elliptic Curve, Cryptographic Hardware and Embedded Systems, Lecture Notes
in Computer Science, 2162 (2001) 126-141

28. Quisquater, J. J., Samyde, D.: ElectroMagnetic Analysis (EMA): Measures and 
Countermeasures for Smart Cards, e-smart 2001, Lecture Notes in Computer Sci- 
ence, 2140 , (2001), 200-210 




Improved Weil and Tate Pairings for Elliptic and 
Hyperelliptic Curves 



Kirsten Eisenträger¹*, Kristin Lauter², and Peter L. Montgomery²

¹ School of Mathematics, Institute for Advanced Study, Einstein Drive, Princeton,
NJ 08540
eisentra@ias.edu

² Microsoft Research, One Microsoft Way, Redmond, WA 98052
klauter@microsoft.com, petmon@microsoft.com



Abstract. We present algorithms for computing the squared Weil and 
Tate pairings on elliptic curves and the squared Tate pairing on hyper- 
elliptic curves. The squared pairings introduced in this paper have the 
advantage that our algorithms for evaluating them are deterministic and 
do not depend on a random choice of points. Our algorithm to evaluate 
the squared Weil pairing is about 20% more efficient than the stan- 
dard Weil pairing. Our algorithm for the squared Tate pairing on elliptic 
curves matches the efficiency of the algorithm given by Barreto, Lynn, 
and Scott in the case of arbitrary base points where their denominator 
cancellation technique does not apply. Our algorithm for the squared 
Tate pairing for hyperelliptic curves is the first detailed implementation 
of the pairing for general hyperelliptic curves of genus 2, and saves an 
estimated 30% over the standard algorithm. 



1 Introduction 

The Weil and Tate pairings have been proposed for use in cryptography,
including one-round 3-way key establishment, identity-based encryption, and short
signatures [9]. For a fixed positive integer m, the Weil pairing is a bilinear
map that sends two m-torsion points on an elliptic curve to an mth root of unity
in the field. For elliptic curves, the Weil pairing is a quotient of two applications
of the Tate pairing, except that the Tate pairing needs an exponentiation which
the Weil pairing omits.

For cryptographic applications, the objective is a bilinear map with a specific 
recipe for efficient evaluation, and no clear way to invert. The Weil and Tate 
pairings provide such tools. Each pairing has a practical definition which involves 
finding functions with prescribed zeros and poles on the curve, and evaluating 
those functions at pairs of points. 

For elliptic curves, Miller [10] gave an algorithm for the Weil pairing. (See also
Appendix B of [3] for a probabilistic implementation of Miller's algorithm
which recursively generates and evaluates the required functions based on a
random choice of points.) For Jacobians of hyperelliptic curves, Frey and Rück
[7] gave a recursive algorithm to generate the required functions, assuming the
knowledge of intermediate functions having prescribed zeros and poles.

* The research for this paper was done while the first author was visiting Microsoft
Research. We thank S. Galbraith for constructive suggestions.

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 169-183, 2004.

For elliptic curves, we present an improved algorithm for computing the
squared Weil pairing, $e_m(P,Q)^2$. Our deterministic algorithm does not depend
on a random choice of points for evaluation of the pairing. Our algorithm saves
about 20% over the standard implementation of the Weil pairing [3]. We use this
idea to obtain an improved algorithm for computing the squared Tate pairing 
for elliptic and hyperelliptic curves. The Tate pairing is already more efficient to 
implement than the Weil pairing. Our new squared Tate pairing is more efficient 
than Miller’s algorithm for the Tate pairing for elliptic curves, for another 20% 
saving. For pairings on special families of elliptic curves in characteristics 2 
and 3, some implementation improvements were given in [8] and [1]. Another 
deterministic algorithm was given in [1]. In [2], an algorithm for the pairing on 
ordinary elliptic curves in arbitrary characteristic is given. Our squared pairing 
matches the efficiency of the algorithm in [2] in the case of arbitrary base points 
where their denominator cancellation technique does not apply. 

For hyperelliptic curves, we use Cantor’s algorithm to produce the intermediate
functions assumed by Frey and Rück. We define a squared Tate pairing for
hyperelliptic curves, and use the knowledge of these intermediate functions to 
implement the pairing and give an example. Our analysis shows that using the 
squared Tate pairing saves roughly 30% over the standard Tate pairing for genus 
2 curves. Our algorithm for the pairing on hyperelliptic curves can be thought 
of as a partial generalization of the Barreto-Lynn-Scott algorithm for elliptic 
curves in the sense that we give a deterministic algorithm which is more efficient 
to evaluate than the standard one. It remains to be seen whether some denominator
cancellation can also be achieved in the hyperelliptic case by choosing
base points of a special form as was done for elliptic curves in [2]. For a special
family of hyperelliptic curves, Duursma and Lee have given a closed formula for 
the pairing in [5], but ours is the first algorithm for the Tate pairing on general 
hyperelliptic curves, and we have implemented the genus 2 case. The squared 
Weil pairing or the squared Tate pairing can be substituted for the Weil or Tate 
pairing in many of the above cryptographic applications. 

The paper is organized as follows. Section 2 provides background on the Weil 
pairing for elliptic curves and gives the algorithm for computing the squared Weil 
pairing. Section 3 does the same for the squared Tate pairing for elliptic curves. 
Section 4 presents the squared Tate pairing for hyperelliptic curves and shows 
how to implement it. Section 5 gives an example of the hyperelliptic pairing. 

2 Weil Pairings for Elliptic Curves 

2.1 Definition of the Weil Pairing 

Let E be an elliptic curve over a finite field $\mathbb{F}_q$. In the following O denotes the
point at infinity on E. If P is a point on E, then $x(P)$ and $y(P)$ denote the
rational functions mapping P to its affine x- and y-coordinates.

Let m be a positive integer. We will use the definition of the Weil pairing $e_m(\cdot,\cdot)$
in [11, p. 107]. To compute $e_m(P,Q)$, given two distinct m-torsion points P
and Q on E over an extension field, pick two divisors $A_P$ and $A_Q$ which are
equivalent to $(P) - (O)$ and $(Q) - (O)$, respectively, and such that $A_P$ and $A_Q$
have disjoint support. Let $f_{A_P}$ be a function on E whose divisor of zeros and
poles is $(f_{A_P}) = mA_P$. Similarly, let $f_{A_Q}$ be a function on E whose divisor of
zeros and poles is $(f_{A_Q}) = mA_Q$. Then

$$e_m(P,Q) = \frac{f_{A_P}(A_Q)}{f_{A_Q}(A_P)}.$$



2.2 Rational Functions Needed in the Evaluation of the Pairing 

Fix an integer m > 0 and an m-torsion point P on an elliptic curve E. Let $A_P$
be a divisor equivalent to $(P) - (O)$. For a positive integer j, let $f_{j,A_P}$ be a
rational function on E with divisor

$$(f_{j,A_P}) = jA_P - (jP) + (O).$$

This means that $f_{j,A_P}$ has j-fold zeros and poles at the points in $A_P$, as well
as a simple pole at jP and a simple zero at O, and no other zeros or poles.
Since mP = O, it follows that $f_{m,A_P}$ has divisor $mA_P$, so in fact $f_{A_P} = f_{m,A_P}$.
Throughout the paper the notation $f_{j,P}$ will be used to denote the function $f_{j,A_P}$
with $A_P = (P) - (O)$.

Silverman [11, Cor. 3.5, p. 67] shows that these functions exist. Each $f_{j,A_P}$ is
unique up to a nonzero multiplicative scalar. Miller’s algorithm gives an iterative
construction of these functions (see for example [1]). The construction of $f_{1,A_P}$
depends on $A_P$. Given $f_{i,A_P}$ and $f_{j,A_P}$, one constructs $f_{i+j,A_P}$ as the product

$$f_{i+j,A_P} = f_{i,A_P} \cdot f_{j,A_P} \cdot \frac{g_{iP,jP}}{g_{(i+j)P}}. \tag{1}$$

Here the notation $g_{U,V}$ (two subscripts) denotes the line passing through the
points U and V on E. The notation $g_U$ (one subscript) denotes the vertical line
through U and $-U$. For more details on efficiently computing $f_{m,A_P}$, see [6].
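As a quick consistency check of the product formula (1), one can add up divisors. The sketch below assumes the generic case and uses the standard facts that the line through U and V has divisor $(U) + (V) + (-U-V) - 3(O)$ and that the vertical line through U has divisor $(U) + (-U) - 2(O)$.

```latex
\begin{align*}
\left( f_{i,A_P}\, f_{j,A_P}\, \frac{g_{iP,jP}}{g_{(i+j)P}} \right)
  &= \bigl[\, iA_P - (iP) + (O) \,\bigr] + \bigl[\, jA_P - (jP) + (O) \,\bigr] \\
  &\quad + \bigl[\, (iP) + (jP) + (-(i+j)P) - 3(O) \,\bigr] \\
  &\quad - \bigl[\, ((i+j)P) + (-(i+j)P) - 2(O) \,\bigr] \\
  &= (i+j)A_P - ((i+j)P) + (O),
\end{align*}
```

which is the divisor required of $f_{i+j,A_P}$.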



2.3 Squared Weil Pairing for Elliptic Curves 

The purpose of this section is to construct a new pairing, which we call the 
‘squared Weil pairing’, and which has the advantage of being more efficient to 
compute than Miller’s algorithm for the original Weil pairing. Our algorithm also 
has the advantage that it is guaranteed to output the correct answer and does 
not depend on inputting a randomly chosen point. In contrast Miller’s algorithm 
may restart, since the randomly chosen point can cause the algorithm to fail. 
2.4 Algorithm for $e_m(P,Q)^2$

Fix a positive integer m and the curve E. Given two m-torsion points P and Q
on E, we want to compute $e_m(P,Q)^2$. Start with an addition-subtraction chain
for m. That is, after an initial 1, every element in the chain is a sum or difference
of two earlier elements, until an m appears. Well-known techniques give a chain
of length $O(\log m)$. For each j in the addition-subtraction chain, form a tuple
$t_j = [jP, jQ, n_j, d_j]$ such that

$$\frac{n_j}{d_j} = \frac{f_{j,P}(Q)\, f_{j,Q}(-P)}{f_{j,P}(-Q)\, f_{j,Q}(P)}. \tag{2}$$
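One of the well-known techniques for building a short addition-subtraction chain is the non-adjacent form (NAF) of m. The following is a minimal sketch of that approach; the function names are illustrative and the paper does not prescribe a particular chain.

```python
def naf_digits(m):
    """Non-adjacent form of m, least significant digit first (digits in {-1, 0, 1})."""
    digits = []
    while m > 0:
        if m % 2:
            d = 2 - (m % 4)  # +1 if m = 1 (mod 4), -1 if m = 3 (mod 4)
            m -= d
        else:
            d = 0
        digits.append(d)
        m //= 2
    return digits


def addsub_chain(m):
    """Addition-subtraction chain for m >= 1: double, then add or subtract 1 per NAF digit."""
    digits = naf_digits(m)
    chain, v = [1], 1
    for d in reversed(digits[:-1]):  # the leading NAF digit is always 1
        v *= 2
        chain.append(v)
        if d:
            v += d
            chain.append(v)
    return chain


def is_valid_chain(chain, m):
    """Check: starts at 1, ends at m, each later entry is a sum or difference of earlier ones."""
    if chain[0] != 1 or chain[-1] != m:
        return False
    for idx in range(1, len(chain)):
        earlier = chain[:idx]
        if not any(a + b == chain[idx] or a - b == chain[idx]
                   for a in earlier for b in earlier):
            return False
    return True
```

For m = 31 this gives the chain 1, 2, 4, 8, 16, 32, 31, which uses one subtraction instead of four additions.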

Start with $t_1 = [P, Q, 1, 1]$. Given $t_j$ and $t_k$, this procedure gets $t_{j+k}$:

1. Form the elliptic curve sums $jP + kP = (j+k)P$ and $jQ + kQ = (j+k)Q$.

2. Find coefficients of the line $g_{jP,kP}(X) = c_0 + c_1 x(X) + c_2 y(X)$.

3. Find coefficients of the line $g_{jQ,kQ}(X) = c'_0 + c'_1 x(X) + c'_2 y(X)$.

4. Set

$$n_{j+k} = n_j n_k \,(c_0 + c_1 x(Q) + c_2 y(Q))\,(c'_0 + c'_1 x(P) - c'_2 y(P)),$$
$$d_{j+k} = d_j d_k \,(c_0 + c_1 x(Q) - c_2 y(Q))\,(c'_0 + c'_1 x(P) + c'_2 y(P)).$$

A similar construction gives $t_{j-k}$ from $t_j$ and $t_k$. The vertical lines through
$(j+k)P$ and $(j+k)Q$ do not appear in the formulae for $n_{j+k}$ and $d_{j+k}$, because
the contributions from Q and $-Q$ (or from P and $-P$) are equal. When j + k = m,
this simplifies to $n_{j+k} = n_j n_k$ and $d_{j+k} = d_j d_k$, since $c_2$ and $c'_2$ will be zero.
When $n_m$ and $d_m$ are nonzero, then the computation

$$\frac{n_m}{d_m} = \frac{f_{m,P}(Q)\, f_{m,Q}(-P)}{f_{m,P}(-Q)\, f_{m,Q}(P)}$$

has been successful, and we have the correct output: by Theorem 1 this ratio
equals $(-1)^m e_m(P,Q)^2$. If, however, $n_m$ or $d_m$ is
zero, then some factor such as $c_0 + c_1 x(Q) + c_2 y(Q)$ must have vanished. That
line was chosen to pass through jP, kP, and $(-j-k)P$, for some j and k. It
does not vanish at any other point on the elliptic curve. Therefore this factor
can vanish only if Q = jP or Q = kP or $Q = (-j-k)P$. In all of these cases Q
will be a multiple of P, ensuring $e_m(P,Q) = 1$.
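The procedure above can be sketched on a toy curve. Everything concrete below is an assumption chosen for illustration and is not taken from the paper: the curve $y^2 = x^3 + 2$ over $\mathbb{F}_7$ (whose 9 rational points are exactly the 3-torsion), a plain binary addition chain, and points of exact order m. The final sign $(-1)^m$ comes from Theorem 1.

```python
# Toy parameters (assumptions for illustration, not from the paper):
# E: y^2 = x^3 + 2 over F_7, whose 9 rational points are exactly E[3].
P_MOD = 7
A_COEF = 0  # coefficient a in y^2 = x^3 + a*x + b (b is not needed below)


def inv(a):
    """Inverse modulo P_MOD (requires Python 3.8+)."""
    return pow(a, -1, P_MOD)


def add_with_line(U, V):
    """Return (U + V, (c0, c1, c2)) for affine points U and V.

    c0 + c1*x + c2*y is the line through U and V (the tangent if U == V).
    If U + V = O the line is vertical; its contributions cancel in the
    squared pairing, so the constant line 1 is returned instead.
    """
    (x1, y1), (x2, y2) = U, V
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None, (1, 0, 0)
    if U == V:
        lam = (3 * x1 * x1 + A_COEF) * inv(2 * y1) % P_MOD
    else:
        lam = (y2 - y1) * inv(x2 - x1) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    # y - y1 - lam*(x - x1) = (lam*x1 - y1) + (-lam)*x + 1*y
    return (x3, y3), ((lam * x1 - y1) % P_MOD, -lam % P_MOD, 1)


def step(tj, tk, P, Q):
    """Steps 1-4 of the procedure: combine t_j and t_k into t_{j+k}."""
    (jP, jQ, nj, dj), (kP, kQ, nk, dk) = tj, tk
    sP, (c0, c1, c2) = add_with_line(jP, kP)
    sQ, (e0, e1, e2) = add_with_line(jQ, kQ)
    (xP, yP), (xQ, yQ) = P, Q
    n = nj * nk * (c0 + c1 * xQ + c2 * yQ) * (e0 + e1 * xP - e2 * yP) % P_MOD
    d = dj * dk * (c0 + c1 * xQ - c2 * yQ) * (e0 + e1 * xP + e2 * yP) % P_MOD
    return (sP, sQ, n, d)


def squared_weil(P, Q, m):
    """e_m(P, Q)^2 for points P, Q of exact order m, via a binary addition chain."""
    t1 = (P, Q, 1, 1)
    t = t1
    for bit in bin(m)[3:]:  # binary digits of m after the leading 1
        t = step(t, t, P, Q)
        if bit == "1":
            t = step(t, t1, P, Q)
    _, _, n, d = t
    if n == 0 or d == 0:
        return 1  # a line factor vanished, so one point is a multiple of the other
    return (-1) ** m * n * inv(d) % P_MOD
```

With P = (0, 3) and Q = (3, 1), both of order 3, `squared_weil(P, Q, 3)` returns 2, a primitive cube root of unity modulo 7. Replacing P by 2P = (0, 4) returns 4 = 2^2, as bilinearity predicts.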



2.5 Correctness Proof 



Theorem 1 (Squared Weil Pairing Formula). Let m be a positive integer.
Suppose P and Q are m-torsion points on E, with neither being the identity and
P not equal to ±Q. Then the squared Weil pairing satisfies

$$\frac{f_{m,P}(Q) \cdot f_{m,Q}(-P)}{f_{m,P}(-Q) \cdot f_{m,Q}(P)} = (-1)^m e_m(P,Q)^2.$$
Proof. Let $R_1$, $R_2$ be points on E such that the divisors $A_P := (P + R_1) - (R_1)$
and $A_Q := (Q + R_2) - (R_2)$ have disjoint support. Let $A_{-Q} := (-Q + R_2) - (R_2)$.
Let $f_{A_P}$ and $f_{A_Q}$ be as above. Then

$$e_m(P,Q) = \frac{f_{A_P}((Q + R_2) - (R_2))}{f_{A_Q}((P + R_1) - (R_1))} = \frac{f_{A_P}(Q + R_2)}{f_{A_P}(R_2)} \cdot \frac{f_{A_Q}(R_1)}{f_{A_Q}(P + R_1)}.$$

Let $g(X) = f_{m,P}(X - R_1)$. Then $(g) = m(P + R_1) - m(R_1) = mA_P = (f_{A_P})$.
This implies $g(X)/f_{A_P}(X)$ is constant and

$$\frac{f_{A_P}(Q + R_2)}{f_{A_P}(R_2)} = \frac{g(Q + R_2)}{g(R_2)} = \frac{f_{m,P}(Q + R_2 - R_1)}{f_{m,P}(R_2 - R_1)}.$$

Similarly

$$\frac{f_{A_Q}(R_1)}{f_{A_Q}(P + R_1)} = \frac{f_{m,Q}(R_1 - R_2)}{f_{m,Q}(P + R_1 - R_2)}.$$

Plugging these into Miller’s formula gives

$$e_m(P,Q) = \frac{f_{m,P}(Q + R_2 - R_1)\, f_{m,Q}(R_1 - R_2)}{f_{m,P}(R_2 - R_1)\, f_{m,Q}(P + R_1 - R_2)}.$$

Using the same argument for $e_m(P,-Q)$ we obtain

$$e_m(P,-Q) = \frac{f_{m,P}(-Q + R_2 - R_1)\, f_{m,-Q}(R_1 - R_2)}{f_{m,P}(R_2 - R_1)\, f_{m,-Q}(P + R_1 - R_2)} = \frac{f_{m,P}(-Q + R_2 - R_1)\, f_{m,Q}(-R_1 + R_2)}{f_{m,P}(R_2 - R_1)\, f_{m,Q}(-P - R_1 + R_2)}.$$

Hence we can simplify $e_m(P,Q)^2$ to

$$\frac{e_m(P,Q)}{e_m(P,-Q)} = \frac{f_{m,P}(Q + R_2 - R_1)}{f_{m,P}(-Q + R_2 - R_1)} \cdot \frac{f_{m,Q}(R_1 - R_2)}{f_{m,Q}(-(R_1 - R_2))} \cdot \frac{f_{m,Q}(-P - R_1 + R_2)}{f_{m,Q}(P + R_1 - R_2)}.$$

Let $R := R_2 - R_1$. This equation becomes

$$e_m(P,Q)^2 = \frac{f_{m,P}(Q + R)\, f_{m,Q}(-R)\, f_{m,Q}(-P + R)}{f_{m,P}(-Q + R)\, f_{m,Q}(R)\, f_{m,Q}(P - R)}. \tag{3}$$

Fix two linearly independent m-torsion points P and Q. The right side of (3)
is a rational function of R; call it $\psi = \psi(R)$. Since $f_{m,P}$ can have zeros and
poles only at P and O, and $f_{m,Q}$ can have zeros and poles only at Q and O, this
function $\psi(R)$ can have zeros or poles only at $R = -Q$, $Q$, $P - Q$, $P + Q$, $P$, and
$O$. By looking at the factors of $\psi$ we can check that at each of these points, the
value of $\psi(R)$ is well-defined, because the zeros and poles cancel each other out.
Since $\psi$ is a rational function on an elliptic curve which does not have any zeros
or poles, $\psi$ must be constant. Since for certain values of R, $\psi(R) = e_m(P,Q)^2$,
this must be the case for all values of R. Hence we may in particular choose
R = O, or equivalently $R_1 = R_2$. So let $R_1 = R_2$. By Lemma 1 below,

$$\frac{f_{m,Q}(R_1 - R_2)}{f_{m,Q}(-(R_1 - R_2))} = (-1)^m,$$

and by assumption $f_{m,P}$ does not have a zero or pole at Q and $f_{m,Q}$ does not
have a zero or pole at P. Hence expression (3) simplifies to

$$e_m(P,Q)^2 = (-1)^m\, \frac{f_{m,P}(Q)\, f_{m,Q}(-P)}{f_{m,P}(-Q)\, f_{m,Q}(P)}. \tag{4}$$



Lemma 1. Let $f : E \to \mathbb{F}_q$ be a rational function on E with a zero of order
m (or a pole of order $-m$) at O. Define $g : E \to \mathbb{F}_q$ by $g(X) = f(X)/f(-X)$.
Then $g(O)$ is finite and $g(O) = (-1)^m$.

Proof. The rational function $h(X) = x(X)/y(X)$ has a zero of order 1 at X = O.
The function $f_1 = f/h^m$ has neither a pole nor a zero at X = O, so $f_1(O)$ is
finite and nonzero. We check that the rational function $\varphi(X) = h(X)/h(-X)$
has no zeros and poles on E. Hence $\varphi$ is constant. By computing $\varphi(X)$ for a
finite point X = (x, y) on E with $x, y \neq 0$, we see that $\varphi$ is equal to $-1$. Hence

$$g(X) = \frac{f(X)}{f(-X)} = \frac{h(X)^m f_1(X)}{h(-X)^m f_1(-X)} = \varphi(X)^m\, \frac{f_1(X)}{f_1(-X)} = (-1)^m\, \frac{f_1(X)}{f_1(-X)},$$

and $g(O) = (-1)^m$.



2.6 Estimated Savings 

In this section we compare our algorithm for the squared Weil pairing to Miller’s 
algorithm for the Weil pairing. We count operations in the underlying finite 
field, counting field squarings as field multiplications throughout. This analysis 
assumes that we use the short Weierstrass form for the elliptic curve E. 

In practice, some of these arithmetic operations may be over a base field 
and others over an extension field. That issue is discussed in more detail in [8]. 
Without knowing the precise context of the application, we don’t distinguish 
these, although individual costs may differ considerably. 



Miller’s algorithm. Miller’s algorithm chooses two points $R_1$, $R_2$ on E, and
lets $A_P = (P + R_1) - (R_1)$ and $A_Q = (Q + R_2) - (R_2)$. Recall that in the notation
of Section 2.1, $f_{A_P}$ is a function whose divisor is $mA_P$. As in Section 2.2, let
$f_{j,A_P}$ be a function with divisor $(f_{j,A_P}) = j(P + R_1) - j(R_1) - (jP) + (O)$. This
is the function $f_j$ in the notation of [3, p. 611f.]. Then $f_{m,A_P} = f_{A_P}$. As pointed
out in Equation (B.1) of [3, p. 612], (1) leads to the recurrence

$$f_{i+j,A_P}(A_Q) = f_{i,A_P}(A_Q) \cdot f_{j,A_P}(A_Q) \cdot \frac{g_{iP,jP}(A_Q)}{g_{(i+j)P}(A_Q)}. \tag{5}$$

During the computations, each $f_{j,A_P}(A_Q)$ is a known field element, unlike
the unevaluated functions $f_{j,A_P}$. Since $A_Q$ has degree 0, the value of $f_{j,A_P}(A_Q)$
is unambiguous, whereas $f_{j,A_P}$ is defined only up to a multiplicative scalar.

To compute the Weil pairing we need

$$e_m(P,Q) = \frac{f_{A_P}(Q + R_2)\, f_{A_Q}(R_1)}{f_{A_P}(R_2)\, f_{A_Q}(P + R_1)} = \frac{f_{m,A_P}(Q + R_2)\, f_{m,A_Q}(R_1)}{f_{m,A_P}(R_2)\, f_{m,A_Q}(P + R_1)}.$$

For integers j in an addition-subtraction chain for m, we will construct a tuple
$t_j = [jP, jQ, n_j, d_j]$ where $n_j$ and $d_j$ satisfy

$$\frac{n_j}{d_j} = \frac{f_{j,A_P}(Q + R_2)\, f_{j,A_Q}(R_1)}{f_{j,A_P}(R_2)\, f_{j,A_Q}(P + R_1)}.$$

To compute $t_{i+j}$ from $t_i$ and $t_j$, one uses the above recurrence (5) to derive the
following expression for $n_{i+j}/d_{i+j}$:

$$\frac{n_{i+j}}{d_{i+j}} = \frac{n_i}{d_i} \cdot \frac{n_j}{d_j} \cdot \frac{g_{iP,jP}(Q + R_2)}{g_{iP,jP}(R_2)} \cdot \frac{g_{(i+j)P}(R_2)}{g_{(i+j)P}(Q + R_2)} \cdot \frac{g_{iQ,jQ}(R_1)}{g_{iQ,jQ}(P + R_1)} \cdot \frac{g_{(i+j)Q}(P + R_1)}{g_{(i+j)Q}(R_1)}. \tag{6}$$

To evaluate, for example, $g_{iP,jP}(Q + R_2)/g_{iP,jP}(R_2)$, start with the elliptic curve
addition $iP + jP = (i+j)P$. This costs 1 field division and 2 field multiplications
in the generic case where iP and jP have distinct x-coordinates and neither is
O. Save the slope $\lambda$ of the line $g_{iP,jP}(X) = y(X) - y(iP) - \lambda(x(X) - x(iP))$
through iP and jP. Two field multiplications suffice to evaluate $g_{iP,jP}(Q + R_2)$
and $g_{iP,jP}(R_2)$ given $Q + R_2$ and $R_2$. No more field multiplications or divisions
are needed to compute the numerator and denominator of

$$\frac{g_{(i+j)P}(R_2)}{g_{(i+j)P}(Q + R_2)} = \frac{x(R_2) - x((i+j)P)}{x(Q + R_2) - x((i+j)P)}.$$

Repeat this once more to evaluate the last two fractions in (6). Overall these
evaluations cost 8 field multiplications and 2 field divisions. We need 10 multiplications
to multiply the six fractions, for an overall cost of 18 multiplications
and 2 divisions.

Squared pairing. The squared pairing needs $n_m/d_m$ where $n_j/d_j$ is given
by (2). The recurrence formula is

$$\frac{n_{i+j}}{d_{i+j}} = \frac{n_i}{d_i} \cdot \frac{n_j}{d_j} \cdot \frac{g_{iP,jP}(Q)}{g_{iP,jP}(-Q)} \cdot \frac{g_{(i+j)P}(-Q)}{g_{(i+j)P}(Q)} \cdot \frac{g_{iQ,jQ}(-P)}{g_{iQ,jQ}(P)} \cdot \frac{g_{(i+j)Q}(P)}{g_{(i+j)Q}(-P)}. \tag{7}$$

This time the update from $t_i = [iP, iQ, n_i, d_i]$ and $t_j$ to $t_{i+j}$ needs 2 elliptic
curve additions. Each elliptic curve addition needs 2 multiplications and 1
division in the generic case. We can evaluate the numerator and denominator of

$$\frac{g_{iP,jP}(Q)}{g_{iP,jP}(-Q)} = \frac{y(Q) - y(iP) - \lambda(x(Q) - x(iP))}{y(-Q) - y(iP) - \lambda(x(-Q) - x(iP))}$$

with only 1 multiplication, since $x(Q) = x(-Q)$.
The fraction $g_{(i+j)P}(-Q)/g_{(i+j)P}(Q)$ simplifies to 1 since $g_{(i+j)P}(X)$ depends
only on $x(X)$, not $y(X)$. Overall 6 multiplications and 2 divisions suffice to
evaluate the numerators and denominators of the six fractions in (7). We multiply
the four non-unit fractions with 6 field multiplications.

Overall, the squared Weil pairing advances from $t_i$ and $t_j$ to $t_{i+j}$ with 12
field multiplications and 2 field divisions in the generic case, compared to 18
field multiplications and 2 field divisions for Miller’s method. When i = j, each
algorithm needs 2 additional field multiplications due to the elliptic curve doublings.
Estimating a division as 5 multiplications, this is roughly a 20% savings.
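The 20% figure can be reproduced with a few lines of arithmetic. The sketch below only restates the operation counts given in this section; the helper name `step_cost` is illustrative.

```python
DIV_COST = 5  # one field division counted as 5 multiplications, as in the text


def step_cost(mults, divs):
    """Cost of one chain step in multiplication-equivalents."""
    return mults + DIV_COST * divs


# Generic addition step (i != j)
miller_add = step_cost(18, 2)    # Miller's algorithm for the Weil pairing
squared_add = step_cost(12, 2)   # squared Weil pairing
saving_add = 1 - squared_add / miller_add

# Doubling step (i == j): 2 extra multiplications for each algorithm
miller_dbl = step_cost(18 + 2, 2)
squared_dbl = step_cost(12 + 2, 2)
saving_dbl = 1 - squared_dbl / miller_dbl

print(f"addition: {squared_add} vs {miller_add}, saving {saving_add:.1%}")
print(f"doubling: {squared_dbl} vs {miller_dbl}, saving {saving_dbl:.1%}")
```

Under these counts an addition step costs 22 instead of 28 multiplication-equivalents (about 21% less) and a doubling step 24 instead of 30 (20% less).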



3 Squared Tate Pairing for Elliptic Curves 

3.1 Squared Tate Pairing Formula 

Let m be a positive integer. Let E be defined over $\mathbb{F}_q$, where m divides q - 1.
Let $E(\mathbb{F}_q)[m]$ denote the m-torsion points on E over $\mathbb{F}_q$. Assume $P \in E(\mathbb{F}_q)[m]$
and $Q \in E(\mathbb{F}_q)$, with neither being the identity and P not equal to a multiple of
Q. The Tate pairing $\phi_m(P,Q)$ on $E(\mathbb{F}_q)[m] \times E(\mathbb{F}_q)/mE(\mathbb{F}_q)$ is defined in [8] as

$$\phi_m(P,Q) := f_{A_P}(A_Q)^{(q-1)/m},$$

with the notation and evaluation as for the Weil pairing above. Now we define

$$v_m(P,Q) := \left(\frac{f_{m,P}(Q)}{f_{m,P}(-Q)}\right)^{(q-1)/m},$$

where $f_{m,P}$ is as above, and call $v_m$ the squared Tate pairing. To justify this
terminology, we will show below that $v_m(P,Q) = \phi_m(P,Q)^2$.



3.2 Algorithm for $v_m(P,Q)$

Fix a positive integer m and the curve E. Given an m-torsion point P on E and
a point Q on E, we want to compute $v_m(P,Q)$. As before, start with an
addition-subtraction chain for m. For each j in the chain, form a tuple $t_j = [jP, n_j, d_j]$
such that

$$\frac{n_j}{d_j} = \frac{f_{j,P}(Q)}{f_{j,P}(-Q)}. \tag{8}$$

Start with $t_1 = [P, 1, 1]$. Given $t_j$ and $t_k$, this procedure gets $t_{j+k}$:

1. Form the elliptic curve sum $jP + kP = (j+k)P$.

2. Find the line $g_{jP,kP}(X) = c_0 + c_1 x(X) + c_2 y(X)$.

3. Set

$$n_{j+k} = n_j \cdot n_k \cdot (c_0 + c_1 x(Q) + c_2 y(Q)),$$
$$d_{j+k} = d_j \cdot d_k \cdot (c_0 + c_1 x(Q) - c_2 y(Q)).$$

A similar construction gives $t_{j-k}$ from $t_j$ and $t_k$. The vertical line through
$(j+k)P$ does not appear in the formulae for $n_{j+k}$ and $d_{j+k}$, because
the contributions from Q and $-Q$ are equal. When j + k = m, one can further
simplify this to $n_{j+k} = n_j \cdot n_k$ and $d_{j+k} = d_j \cdot d_k$, since $c_2$ will be zero. When $n_m$
and $d_m$ are nonzero, then the computation of (8) with j = m is successful, and
after raising to the $(q-1)/m$ power, we have the correct output. If some $n_m$ or
$d_m$ were zero, then some factor such as $c_0 + c_1 x(Q) + c_2 y(Q)$ must have vanished.
That line was chosen to pass through jP, kP, and $(-j-k)P$, for some j and k.
It does not vanish at any other point on the elliptic curve. Therefore this factor
can vanish only if Q = jP or Q = kP or $Q = (-j-k)P$ for some j and k. In
all of these cases Q would be a multiple of P, contrary to our assumption.
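A minimal sketch of this procedure follows. As assumptions for illustration only (none of this is from the paper), it uses the toy curve $y^2 = x^3 + 2$ over $\mathbb{F}_7$ with m = 3, a binary addition chain, and a point P of exact order m.

```python
# Toy parameters (assumptions, not from the paper): y^2 = x^3 + 2 over F_7, m = 3.
P_MOD = 7
A_COEF = 0  # coefficient a in y^2 = x^3 + a*x + b


def inv(a):
    """Inverse modulo P_MOD (requires Python 3.8+)."""
    return pow(a, -1, P_MOD)


def add_with_line(U, V):
    """Return (U + V, (c0, c1, c2)), with c0 + c1*x + c2*y the line through U and V.

    For a vertical line (U + V = O) the constant line 1 is returned, since the
    values at Q and -Q are equal and cancel in the squared pairing.
    """
    (x1, y1), (x2, y2) = U, V
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None, (1, 0, 0)
    if U == V:
        lam = (3 * x1 * x1 + A_COEF) * inv(2 * y1) % P_MOD
    else:
        lam = (y2 - y1) * inv(x2 - x1) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3), ((lam * x1 - y1) % P_MOD, -lam % P_MOD, 1)


def squared_tate(P, Q, m):
    """v_m(P, Q) = (f_{m,P}(Q) / f_{m,P}(-Q))^((q-1)/m) for P of exact order m."""
    xQ, yQ = Q

    def step(tj, tk):
        S, (c0, c1, c2) = add_with_line(tj[0], tk[0])
        n = tj[1] * tk[1] * (c0 + c1 * xQ + c2 * yQ) % P_MOD
        d = tj[2] * tk[2] * (c0 + c1 * xQ - c2 * yQ) % P_MOD
        return (S, n, d)

    t1 = (P, 1, 1)
    t = t1
    for bit in bin(m)[3:]:  # binary digits of m after the leading 1
        t = step(t, t)
        if bit == "1":
            t = step(t, t1)
    _, n, d = t
    if n == 0 or d == 0:
        raise ValueError("Q is a multiple of P; the assumption of Section 3.1 fails")
    return pow(n * inv(d), (P_MOD - 1) // m, P_MOD)
```

With P = (0, 3) and Q = (3, 1) the function returns 2. Doubling either argument, to 2P = (0, 4) or 2Q = (3, 6), returns 4 = 2^2, consistent with bilinearity.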



3.3 Correctness Proof 



Theorem 2. Let m be a positive integer. Suppose $P \in E(\mathbb{F}_q)[m]$ and $Q \in E(\mathbb{F}_q)$
with neither being the identity and $P \neq \pm Q$. Then the squared Tate pairing is

$$\phi_m(P,Q)^2 = \left(\frac{f_{m,P}(Q)}{f_{m,P}(-Q)}\right)^{(q-1)/m}.$$

Proof. Let $R_1$ and $R_2$ be as in the proof of Theorem 1. The proof proceeds
exactly as the correctness proof for the Weil pairing. The only difference is that
the factor of $(-1)^m$ is missing in the Tate pairing and so we have

$$\phi_m(P,Q)^2 = \left(\frac{f_{m,P}(Q + R_2 - R_1)}{f_{m,P}(-Q + R_2 - R_1)}\right)^{(q-1)/m}.$$

By the same argument as in the proof for the Weil pairing we may choose
$R_2 = R_1$, which gives us the desired formula.



3.4 Estimated Savings 

This analysis is almost identical to that for the Weil pairing in Section 2.6. 
When analyzing Miller’s algorithm for the Tate pairing, the main difference 
from Section 2.6 is that the analog of (6) has 2 fewer fractions to evaluate and 
combine. An elliptic curve addition costs 1 division and 2 multiplications, while 
2 multiplications are needed to evaluate the numerators and denominators of 
the two fractions. Then 6 multiplications are needed to combine the numerators 
and denominators of the 4 fractions. Therefore each step of Miller’s algorithm 
performing an addition costs 1 division and 10 multiplications. 

For the squared Tate pairing, the analog of (7) also has 2 fewer fractions in
it. An elliptic curve addition costs 1 division and 2 multiplications, while only
1 multiplication is needed to evaluate the numerators and denominators of the
2 fractions. Then 4 multiplications are needed to combine the numerators and
denominators of the 3 non-unit fractions. Therefore each step of the squared Tate
pairing algorithm performing an addition costs 1 division and 7 multiplications.




178 K. Eisentrager, K. Lauter, and P.L. Montgomery 



Overall, the squared Tate pairing advances from $t_i$ and $t_j$ to $t_{i+j}$ with 7
field multiplications and 1 field division in the generic case, compared to 10 field
multiplications and 1 field division for Miller’s method applied to the usual Tate
pairing. When i = j, each algorithm needs one additional field multiplication
due to the elliptic curve doubling. Estimating a division as 5 multiplications,
this is roughly a 20% savings.
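Spelled out with the counts above and one division taken as 5 multiplications, the comparison per chain step reads as follows (a worked restatement of the figures in this section, nothing more).

```latex
\begin{align*}
\text{Miller, addition:}       \quad & 10 + 1 \cdot 5 = 15, \\
\text{squared Tate, addition:} \quad & 7 + 1 \cdot 5 = 12, & 1 - \tfrac{12}{15} &= 20\%, \\
\text{Miller, doubling:}       \quad & 11 + 1 \cdot 5 = 16, \\
\text{squared Tate, doubling:} \quad & 8 + 1 \cdot 5 = 13, & 1 - \tfrac{13}{16} &\approx 18.8\%.
\end{align*}
```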

Comparing our squared pairing to the algorithm from [2] , the algorithms are 
equally efficient in the case of general base points, where there is no cancellation 
of denominators in their algorithm. In [2], the authors show that if the security
multiplier is even (k = 2d) and the x-coordinate of the base point Q lies in a
subfield $\mathbb{F}_{q^d}$, then the denominators in the Tate pairing evaluation disappear.
This makes their method more efficient, but it is possible that adding this extra 
structure may weaken the system for cryptographic use. Also, in some situations, 
restricting to k even may not be desirable. 

4 Squared Tate Pairing for Hyperelliptic Curves 

Let C be a hyperelliptic curve of genus g given by an affine model $y^2 = f(x)$
with $\deg f = 2g + 1$ over a finite field $\mathbb{F}_q$ not of characteristic 2. The curve C has
one point at infinity, which we will denote by $P_\infty$. Let J = J(C) be the Jacobian
of C. If P = (x, y) is a point on C, then $P'$ will denote the point $P' := (x, -y)$.
We denote the identity element of J by id.

The Riemann-Roch theorem assures that each element D of J contains a
representative of the form $A - gP_\infty$, where A is an effective divisor of degree g.
In addition, we will always work with semi-reduced representatives, which means
that if a point P = (x, y) occurs in A then $P' := (x, -y)$ does not occur elsewhere
in A. The effective divisor representing the identity element id will be $gP_\infty$. For
an element D of J and integer i, a representative for iD will be $A_i - gP_\infty$, where
$A_i$ is effective of degree g and semi-reduced.

To a representative $A_i - gP_\infty$ we associate two polynomials $(a_i, b_i)$ which
represent the divisor. The first polynomial, $a_i(x)$, is monic and has zeros at
the x-coordinates of the points in the support of the divisor $A_i$. The second
polynomial, $b_i(x)$, has degree less than $\deg(a_i(x))$, and the graph of $y = b_i(x)$
passes through the finite points in the support of the divisor $A_i$.

4.1 Definition of the Tate Pairing 

Fix a positive integer m and assume that $\mathbb{F}_q$ contains a primitive mth root of
unity $\zeta_m$. The Tate pairing, $\phi_m : J(\mathbb{F}_q)[m] \times J(\mathbb{F}_q)/mJ(\mathbb{F}_q) \to \mathbb{F}_q^*/\mathbb{F}_q^{*m} \cong \langle\zeta_m\rangle$,
is defined in [7, p. 871] explicitly as follows. Let $D \in J(\mathbb{F}_q)[m]$ and $E \in J(\mathbb{F}_q)$.
Let $h_{m,D}$ be a function on C whose divisor is $(h_{m,D}) = mD$. Then

$$\phi_m(D,E) := h_{m,D}(E)^{(q-1)/m} \in \langle\zeta_m\rangle.$$

This pairing is known to be well-defined, bilinear, and non-degenerate. The value
$h_{m,D}(E)$ is defined only up to mth powers, so we raise the result to the power
$(q-1)/m$ to eliminate all mth powers. Note that E is a divisor on the curve C, not
an elliptic curve. We also assume that the support of E does not contain $P_\infty$
and that E is prime to the $A_i$’s. Actually E needs to be prime to only those
representatives which will be used in the addition-subtraction chain for m, so to
about log m divisors.

Frey and Rück [7, pp. 872-873] show how to evaluate the Tate pairing on
the Jacobian of a curve assuming an explicit reduction algorithm for divisors
on a curve. Cantor [4] gives such an algorithm for hyperelliptic curves when the
degree of f is odd. In Section 4.4 below, we use Cantor’s algorithm to explicitly
compute the necessary intermediate functions. These functions will be used to 
evaluate the squared Tate pairing, but they could just as well be used to evaluate 
the usual Tate pairing. 



4.2 Squared Tate Pairing $v_m$ for Hyperelliptic Curves

Theorem 3. Given an m-torsion element D of J and an element E of J, with
representatives $D = P_1 + P_2 + \cdots + P_g - gP_\infty$ and $E = Q_1 + Q_2 + \cdots + Q_g - gP_\infty$
respectively, with $P_i$ not equal to $Q_j$ or $Q'_j$ for any i, j, define

$$v_m(D,E) := \bigl(h_{m,D}(Q_1 - Q'_1 + Q_2 - Q'_2 + \cdots + Q_g - Q'_g)\bigr)^{(q-1)/m}.$$

Then $v_m(D,E) = \pm\phi_m(D,E)^2$, where $\phi_m(D,E)$ is the Tate pairing defined
above.


Proof. Recall that if $P_i = (x, y)$ is a point on C, then $P'_i$ is the point $(x, -y)$.
Similarly, if $D = P_1 + P_2 + \cdots + P_g - gP_\infty$, let $D' = P'_1 + P'_2 + \cdots + P'_g - gP_\infty$.
For the proof, we will compute $\phi_m(2D, 2E)$.

Observe that $E - E' = Q_1 - Q'_1 + Q_2 - Q'_2 + \cdots + Q_g - Q'_g \sim 2E$ in the
Jacobian of C, since $E + E' = (Q_1 + Q'_1 - 2P_\infty) + \cdots + (Q_g + Q'_g - 2P_\infty) \sim \mathrm{id}$.

Let $h_{m,D}$ denote the rational function on C with divisor $(h_{m,D}) = mP_1 + \cdots + mP_g - mgP_\infty$
as above. Then the divisor of $h_{m,D}/h_{m,D'}$ has the form

$$\left(\frac{h_{m,D}}{h_{m,D'}}\right) = mP_1 - mP'_1 + \cdots + mP_g - mP'_g,$$

so $(h_{m,D}/h_{m,D'}) \sim 2mD$ in the Jacobian. That means we can use $h_{m,D}/h_{m,D'}$
to compute the pairing $\phi_m(2D, 2E)$. If Q is any point on C, then we can see by
comparing the divisors of the two functions that $h_{m,D}(Q) = c \cdot h_{m,D'}(Q')$, where
c is a constant which does not depend on Q.

Hence

$$\phi_m(2D, 2E) = \left(\frac{h_{m,D}(E - E')}{h_{m,D'}(E - E')}\right)^{(q-1)/m} = \left(\frac{h_{m,D}(E - E')}{h_{m,D}(E' - E)}\right)^{(q-1)/m}.$$

Since $h_{m,D}(E' - E) = h_{m,D}(E - E')^{-1}$, the right side equals $h_{m,D}(E - E')^{2(q-1)/m}$.
Since $\phi_m(2D, 2E) = \phi_m(D, E)^4$, it follows that

$$\phi_m(D, E)^2 = \pm\bigl(h_{m,D}(Q_1 - Q'_1 + \cdots + Q_g - Q'_g)\bigr)^{(q-1)/m}.$$
4.3 Functions Needed in the Evaluation of the Pairings

Let D be an m-torsion element of J. For a positive integer j, let $h_{j,D}$ denote a
rational function on C with divisor

$$(h_{j,D}) = jA_1 - A_j - (j-1)gP_\infty.$$

Since D is an m-torsion element, we have that $A_m = gP_\infty$, so the divisor of $h_{m,D}$
is $(h_{m,D}) = mA_1 - m \cdot gP_\infty$. Each $h_{j,D}$ is well-defined up to a multiplicative
constant.

Given positive divisors $A_i$ and $A_j$, we can use Cantor’s algorithm to find a
positive divisor $A_{i+j}$ and a function $u_{i,j}$ with divisor equal to

$$(u_{i,j}) = A_i + A_j - A_{i+j} - gP_\infty.$$

We construct $h_{j,D}(E)$ iteratively. For j = 1, let $h_{1,D}$ be 1. Suppose we have $A_i$,
$A_j$, $h_{i,D}(E)$ and $h_{j,D}(E)$. Let $u_{i,j}$ be the above function on C. Then

$$h_{i+j,D}(E) = h_{i,D}(E) \cdot h_{j,D}(E) \cdot u_{i,j}(E).$$
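That this product has the divisor required of $h_{i+j,D}$ can be checked directly from the two divisor formulas of this section:

```latex
\begin{align*}
(h_{i,D}) + (h_{j,D}) + (u_{i,j})
  &= \bigl[\, iA_1 - A_i - (i-1)gP_\infty \,\bigr]
   + \bigl[\, jA_1 - A_j - (j-1)gP_\infty \,\bigr] \\
  &\quad + \bigl[\, A_i + A_j - A_{i+j} - gP_\infty \,\bigr] \\
  &= (i+j)A_1 - A_{i+j} - (i+j-1)gP_\infty .
\end{align*}
```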



4.4 Algorithm to Compute $v_m(D, E)$

Let D and E be as above. Form an addition-subtraction chain for m. For each
j in the chain we need to form a tuple $t_j = [A_j, n_j, d_j]$ such that jD has
representative $A_j - 2P_\infty$ and

$$\frac{n_j}{d_j} = \frac{h_{j,D}(Q_1)\, h_{j,D}(Q_2)}{h_{j,D}(Q'_1)\, h_{j,D}(Q'_2)}.$$

Let $t_1 = [A_1, 1, 1]$. Given $t_i$ and $t_j$, let $(a_i, b_i)$ and $(a_j, b_j)$ be the polynomials
corresponding to the divisors $A_i$ and $A_j$. Do a composition step as in Cantor’s
algorithm to obtain (a, b) corresponding to $A_i + A_j$, without performing the
reduction step. Let $d(x) = \gcd(a_i(x), a_j(x), b_i(x) + b_j(x))$. The output polynomials
a, b, and d depend on i and j, but we will omit the subscripts here for ease
of notation. If d(x) = 1, then $a(x) = a_i(x)a_j(x)$, and b(x) is the polynomial with
$\deg(b) < \deg(a)$ such that y = b(x) passes through the distinct finite points in
the support of $A_i$ and $A_j$.

The reduction step described in [4, p. 99] then replaces (a, b) by $(\tilde{a}, \tilde{b})$ where
$\tilde{a} = (f - b^2)/a$, $\tilde{b} = -b \pmod{\tilde{a}}$ and $\deg(\tilde{b}) < \deg(\tilde{a})$. This reduction step is
applied repeatedly until $\deg(\tilde{a}) \le g$. In the genus 2 situation, it follows from [4,
p. 99] that at most one reduction step is performed.



Case i. If g = 2 and $\deg(a(x)) > 2$, a reduction step is performed. If we let

$$v_{i,j}(P) := \frac{a(x(P))}{b(x(P)) + y(P)}, \tag{9}$$

and

$$u_{i,j}(P) := v_{i,j}(P) \cdot d(x(P)),$$

then $(u_{i,j}) = A_i + A_j - A_{i+j} - 2P_\infty$, and

$$\frac{u_{i,j}(P)}{u_{i,j}(P')} = \frac{a(x(P))}{a(x(P'))} \cdot \frac{b(x(P')) + y(P')}{b(x(P)) + y(P)} \cdot \frac{d(x(P))}{d(x(P'))} = \frac{b(x(P')) + y(P')}{b(x(P)) + y(P)}. \tag{10}$$

Let

$$n_{i+j} := n_i \cdot n_j \cdot (b + y)(Q'_1) \cdot (b + y)(Q'_2),$$
$$d_{i+j} := d_i \cdot d_j \cdot (b + y)(Q_1) \cdot (b + y)(Q_2).$$

There is no contribution from a in $n_{i+j}$ and $d_{i+j}$ because the contributions from
$Q_i$ and $Q'_i$ are equal. This improves the algorithm for the Tate pairing in [7].

Case ii. If g = 2 and $\deg(a(x)) \le 2$, then $u_{i,j}(P) = d(x(P))$. In this case we let
$n_{i+j} := n_i \cdot n_j$ and $d_{i+j} := d_i \cdot d_j$.

Case iii. Suppose g > 2. If r reduction steps are needed, then to compute $u_{i,j}$,
we obtain intermediate factors $v_{i,j}^{(1)}, \ldots, v_{i,j}^{(r)}$, one factor as in (9) per reduction
step. Then $u_{i,j}$ will be the product $u_{i,j} := v_{i,j}^{(1)} \cdots v_{i,j}^{(r)} \cdot d(x(P))$.

Note: If we evaluate $n_i$ and $d_i$ at intermediate steps then it is not enough to
assume that the divisors D and E are coprime. Instead, E must also be coprime
to $A_i$ for all i which occur in the addition chain for m. One way to ensure
this condition is to require that E and D be linearly independent and that the
polynomial p(x) in the pair (p(x), q(x)) representing E be irreducible. There are
other possible ways to achieve this, such as changing the addition chain for m.

4.5 Estimated Savings for Genus 2 

Using a straightforward implementation of Cantor’s algorithm, the total costs for 
doubling and addition on the Jacobian of a hyperelliptic curve of genus 2 in odd 
characteristic, $C : y^2 = f(x)$, where f has degree 5, are as follows. Doubling an
element costs 34 multiplications and 2 inversions. Adding two distinct elements 
of J costs 26 multiplications and 2 inversions. More efficient implementations of 
the group law may alter the total impact of our algorithm. Different field multi- 
plication/inversion ratios and field sizes, as well as differing costs in an extension 
field will also affect the analysis, but these costs are chosen as representative for 
the purpose of estimating the savings. 

Analysis of standard algorithm. Let $D := P_1 + P_2 - 2P_\infty$. Let $R_1$, $R_2$, $R_3$,
$R_4$ be four points on C such that $Q_1 + Q_2 - 2P_\infty \sim R_1 + R_2 - R_3 - R_4$ in J.
The algorithm in [7] computes $t_{i+j}$ from $t_i$ and $t_j$, where $t_j = [A_j, n_j, d_j]$ and

$$\frac{n_j}{d_j} = \frac{h_{j,D}(R_1)\, h_{j,D}(R_2)}{h_{j,D}(R_3)\, h_{j,D}(R_4)}.$$

The expression for $n_{i+j}/d_{i+j}$ becomes

$$\frac{n_{i+j}}{d_{i+j}} = \frac{n_i}{d_i} \cdot \frac{n_j}{d_j} \cdot \frac{u_{i,j}(R_1)\, u_{i,j}(R_2)}{u_{i,j}(R_3)\, u_{i,j}(R_4)}.$$

To form $u_{i,j}$, we have to perform an addition or doubling step to obtain $A_{i+j}$
from $A_i$ and $A_j$. This costs 34 multiplications and 2 inversions for a doubling,
26 multiplications and 2 inversions for an addition. Then

$$u_{i,j}(P) = \frac{a(x(P))}{b(x(P)) + y(P)},$$

and to compute $(n_{i+j}, d_{i+j})$, we need to evaluate $u_{i,j}$ at four different points.
Each evaluation of a(x(P)) costs 2 multiplications in a doubling step, 3 multiplications
in an addition step (square or product of monic quadratics). Evaluation
of b(x(P)) (cubic) costs 3 multiplications. Finally we multiply the partial
numerators and denominators out, using 5 multiplications each, including the
multiplications with $n_i$, $n_j$, $d_i$, and $d_j$. So the total cost for an addition step
is 60 multiplications and 2 inversions, and the total cost for a doubling is 64
multiplications and 2 inversions.



Squared Tate pairing. The squared Tate pairing works with the divisor
$Q_1 - Q'_1 + Q_2 - Q'_2 \sim 2Q_1 + 2Q_2 - 4P_\infty$. After adding $A_i$ and $A_j$ to obtain $A_{i+j}$ as
above, we need to form

$$\frac{n_{i+j}}{d_{i+j}} = \frac{n_i}{d_i} \cdot \frac{n_j}{d_j} \cdot \frac{u_{i,j}(Q_1)\, u_{i,j}(Q_2)}{u_{i,j}(Q'_1)\, u_{i,j}(Q'_2)}.$$

As can be seen from (10) above, no evaluations of a(x(P)) are needed. For i =
1, 2, we need to evaluate $b(x(Q_i))$ and $b(x(Q'_i))$. This costs only 3 multiplications
for each i, since the x-coordinates of $Q_i$ and $Q'_i$ are the same. Finally, we have
to multiply the partial numerators and denominators, for a total cost of 12
multiplications for either a doubling or an addition.

So the total cost for an addition step is 38 multiplications and 2 inversions, 
and the total cost for a doubling is 46 multiplications and 2 inversions. Estimat- 
ing an inversion as 4 multiplications, this is a 25% improvement in the doubling 
case and a 33% improvement in the addition case. 
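The totals and percentages for genus 2 follow from the counts listed in this section. The short tally below restates them; variable names are illustrative, and the split of the 12 multiplications into 6 for evaluating b and 6 for the products is read off from the text.

```python
INV_COST = 4  # one inversion counted as 4 multiplications, as in the text
GROUP_ADD, GROUP_DBL = 26, 34  # Cantor addition / doubling (each also needs 2 inversions)

# Standard algorithm: u_{i,j} at 4 points (a and b each time), then 2 * 5 products.
std_add = GROUP_ADD + 4 * (3 + 3) + 2 * 5
std_dbl = GROUP_DBL + 4 * (2 + 3) + 2 * 5

# Squared Tate pairing: b at x(Q_1), x(Q_2) (3 each), then 2 * 3 products.
sq_add = GROUP_ADD + 2 * 3 + 2 * 3
sq_dbl = GROUP_DBL + 2 * 3 + 2 * 3


def with_inversions(mults, invs=2):
    """Total cost in multiplication-equivalents."""
    return mults + INV_COST * invs


saving_add = 1 - with_inversions(sq_add) / with_inversions(std_add)
saving_dbl = 1 - with_inversions(sq_dbl) / with_inversions(std_dbl)

print(std_add, std_dbl, sq_add, sq_dbl)  # multiplication counts per step
print(f"addition saving {saving_add:.1%}, doubling saving {saving_dbl:.1%}")
```

This reproduces 60, 64, 38 and 46 multiplications, a 25% saving for doubling and about 32% for addition, which the text rounds to 33%.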



5 Example: g = 2, p = 31, m = 5 

In this section, we evaluate the squared Tate pairing on 5-torsion on the Jacobian
of a hyperelliptic genus 2 curve over a field of 31 elements. Let C be defined by
the affine model $y^2 = f(x)$ where $f(x) = x^5 + 13x^4 + 2x^3 + 4x^2 + 11x + 1$.
The group of points on the Jacobian of C over $\mathbb{F}_{31}$ has order N = 1040. Let D
be the 5-torsion element of the Jacobian of C given by the pair of polynomials
$D = [x^2 + 23x + 15,\ 13x + 28]$. Let E be the element of the Jacobian of C of
order 260 given by the pair $E = [x^2 + 4x + 2,\ 29x + 20]$. Then the squared Tate
pairing evaluated at D and E is $v_5(D, E) = 4$, where

$$h_{5,D}(x,y) = \frac{(x + 26)^2 (x^4 + 19x^3 + 23x^2 + 16x + 19)(x^2 + 23x + 15)}{x^3 + 6x^2 + 9x + 21 + y}.$$
To illustrate the bilinearity of the pairing, look for example at
$2D = [x^2 + 25x + 9,\ 10x + 6]$, $3D = [x^2 + 25x + 9,\ 21x + 25]$, and $2E = [x^2 + x + 3,\ 26x + 3]$.
Then we compute that indeed $v_5(2D, E) = 16 = v_5(D, E)^2$, with

$$h_{5,2D}(x,y) = \frac{(x + 26)(x^4 + 19x^3 + 23x^2 + 16x + 19)^2 (x^2 + 25x + 9)}{(x^3 + 6x^2 + 9x + 21 + y)^2},$$

and $v_5(D, 2E) = 16 = v_5(D, E)^2$, with $h_{5,D}$ as above. Also

$$v_5(3D, E) = 2 = v_5(D, E)^3 \pmod{31},$$

with

$$h_{5,3D}(x,y) = \frac{(x + 26)(x^4 + 19x^3 + 23x^2 + 16x + 19)^2 (x^2 + 25x + 9)}{(30x^3 + 25x^2 + 22x + 10 + y)^2}.$$
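The data in this example can be sanity-checked with a short script. The sketch below makes two assumptions that are not stated in the paper: the points $Q_1 = (6, 8)$ and $Q_2 = (21, 9)$ were obtained here by factoring the first polynomial of E over $\mathbb{F}_{31}$, and the evaluation uses the fact that every factor of $h_{5,D}$ other than the denominator $c(x) + y$, with $c(x) = x^3 + 6x^2 + 9x + 21$, depends only on x and so cancels between $Q_i$ and $Q'_i$.

```python
p = 31
f = [1, 11, 4, 2, 13, 1]  # f(x) = x^5 + 13x^4 + 2x^3 + 4x^2 + 11x + 1, low degree first


def poly_mul(u, v):
    """Product of two polynomials over F_p (coefficient lists, low degree first)."""
    out = [0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] = (out[i + j] + ui * vj) % p
    return out


def poly_rem_monic(u, a):
    """Remainder of u modulo the monic polynomial a over F_p."""
    u = [c % p for c in u]
    while len(u) >= len(a):
        lead, shift = u[-1], len(u) - len(a)
        for i, ai in enumerate(a):
            u[shift + i] = (u[shift + i] - lead * ai) % p
        u.pop()  # the leading coefficient is now zero
    return u


def is_mumford_pair(a, b):
    """Check that a divides f - b^2, as required of a pair [a, b] on y^2 = f(x)."""
    b2 = poly_mul(b, b)
    diff = [(fc - (b2[i] if i < len(b2) else 0)) % p for i, fc in enumerate(f)]
    return all(c == 0 for c in poly_rem_monic(diff, a))


def poly_eval(coeffs, x):
    """Horner evaluation over F_p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def v5(c, points, power=1):
    """((prod (c - y) / (c + y)) ** power) ** ((p - 1) / 5) over the given points."""
    num = den = 1
    for x, y in points:
        cx = poly_eval(c, x)
        num = num * (cx - y) % p
        den = den * (cx + y) % p
    ratio = num * pow(den, -1, p) % p
    return pow(ratio, power * (p - 1) // 5, p)


pairs = {
    "D": ([15, 23, 1], [28, 13]),
    "E": ([2, 4, 1], [20, 29]),
    "2D": ([9, 25, 1], [6, 10]),
    "3D": ([9, 25, 1], [25, 21]),
    "2E": ([3, 1, 1], [3, 26]),
}
c = [21, 9, 6, 1]          # c(x) = x^3 + 6x^2 + 9x + 21
E_points = [(6, 8), (21, 9)]  # derived here from E, not quoted in the paper
```

Here `v5(c, E_points)` reproduces the value 4 quoted for $v_5(D, E)$, and `v5(c, E_points, power=2)` gives 16, matching $v_5(2D, E)$, whose function has the squared denominator.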
Abstract. We use rational parametrizations of certain cubic surfaces
and an explicit formula for descent via 3-isogeny to construct the first
examples of elliptic curves $E_k : x^3 + y^3 = k$ of ranks 8, 9, 10, and 11
over Q. As a corollary we produce examples of elliptic curves over Q with
a rational 3-torsion point and rank as high as 11. We also discuss the
problem of finding the minimal curve $E_k$ of a given rank, in the sense
of both $|k|$ and the conductor of $E_k$, and we give some new results in
this direction. We include descriptions of the relevant algorithms and
heuristics, as well as numerical data.



1 Introduction 

In the fundamental Diophantine problem of finding rational points on an elliptic 
curve E, one is naturally led to ask which abelian groups can occur as the 
group of rational points $E(\mathbf{Q})$. Mordell's theorem guarantees that $E(\mathbf{Q})$ is finitely
generated, so we have

\[ E(\mathbf{Q}) \cong E(\mathbf{Q})_{\mathrm{tors}} \oplus \mathbf{Z}^r, \]

where $r$ is the rank of $E$. Mazur's well-known work [Ma] completely classifies the
possibilities for $E(\mathbf{Q})_{\mathrm{tors}}$, but the behavior of the rank remains mysterious. Part
of the “folklore” is the conjecture that there exist elliptic curves with arbitrarily 
large rank over Q. But large rank examples are rare, and the record to date 
is 24 [MM]. One might further ask about the distribution of ranks in families 
of twists, or with prescribed Galois structure on the torsion subgroup; there is 
some evidence to suggest that conditions of this sort do not impose an upper 
bound on the rank. 

A classical question in number theory is to describe the numbers k that can 
be written as the sum of two rational cubes. This leads one to study the elliptic 
curves 



\[ E_k : x^3 + y^3 = k \]

for $k \in \mathbf{Q}^*$. Clearly $E_k$ and $E_{k'}$ are isomorphic if $k/k'$ is a cube, so we can and
will restrict our attention to positive cubefree integers $k$. A Weierstrass equation

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 184-193, 2004. 
for $E_k$ is given by $Y^2 = X^3 - 432k^2$, where

\[ X = \frac{12k}{y + x}, \qquad Y = 36k\,\frac{y - x}{y + x}. \]


As long as $k > 2$, the group $E_k(\mathbf{Q})_{\mathrm{tors}}$ is trivial, so $E_k$ has a nontrivial
rational point if and only if its rank is positive. The distribution of ranks in this
family is not well understood. Zagier and Kramarz [ZK] used numerical evidence
for $k < 70000$ to conjecture that a positive proportion of the curves $E_k$ have rank
at least 2; however, more recent computations by Mark Watkins [Wa] suggest
that, in fact, a curve $E_k$ has rank 0 or 1 with probability 1. Still, the following
conjecture is widely believed: 



Conjecture 1. There exist elliptic curves $E_k$ with arbitrarily large rank over $\mathbf{Q}$.



A proof of this conjecture seems beyond the reach of current techniques. So 
for now we content ourselves with constructing high-rank examples within this 
family (thereby adding to the body of supporting evidence) , and gathering more 
data on the distribution of ranks so as to be able to formulate more precise 
conjectures. The main results of this paper are examples of curves $E_k$ with
rank $r$ for each $r \le 11$. For $r = 8, 9, 10, 11$ these are the first curves known of
those ranks; for $r = 6, 7$ our curves have $k$ smaller than previous records, and
are proved minimal assuming some standard conjectures. For $r \le 5$ we recover
previously known $k$, and prove unconditionally that they are minimal.

Throughout, we make use of the fact that the curves $E_k$ are 3-isogenous to
the curves

\[ E'_k : uv(u + v) = k \]

or, in Weierstrass form, $E'_k : V^2 = U^3 + 16k^2$, where

\[ U = \frac{4k}{v}, \qquad V = \frac{8ku + 4kv}{v}. \]



The isogeny is given by: 



2 ^ 

4>-.Ek^E'k, {x,y) ^ {u,v) = { — , ). 

X xy 

The dual isogeny, with respect to the Weierstrass equations for $E_k$ and $E'_k$, is

\[ \phi' : E'_k \to E_k, \qquad (U, V) \mapsto (X, Y) = \left( \frac{U^3 + 64k^2}{U^2},\ \frac{V(U^3 - 128k^2)}{U^3} \right). \]



Applying Tate's Algorithm [Ta] to the curves $E'_k$, we find that a minimal
Weierstrass form for $E'_k$ is given by

\[ Z^2 = W^3 + \frac{k^2}{4}, \qquad (W, Z) = \left( \frac{U}{4},\ \frac{V}{8} \right) \]

in the case that $k$ is even, and

\[ Z^2 + Z = W^3 + \frac{k^2 - 1}{4}, \qquad (W, Z) = \left( \frac{U}{4},\ \frac{V - 4}{8} \right) \]

in the case that $k$ is odd. The primes of bad reduction for $E'_k$ are the primes
dividing $k$ and the prime 3. For a prime factor $p$ of $k$, the Kodaira type at $p$ is
IV* if $p^2 \mid k$ and IV if $p \,\|\, k$. If $3 \nmid k$, the Kodaira type at 3 is III if $k \equiv \pm 2 \bmod 9$,
and II otherwise. It follows that the conductor of $E'_k$ is given by the formula

\[ N(E'_k) = \prod_{p \mid 3k} p^{2 + \beta_p}, \]

where $\beta_p = 0$ if $p \ne 3$, and $\beta_3 = 0$, $1$, or $3$ for $k \equiv \pm 2 \bmod 9$, $k \equiv \pm 1, \pm 4
\bmod 9$, or $3 \mid k$ respectively.
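The conductor formula is easy to evaluate by machine. The following Python sketch is an illustration only: the function name is ours, and it assumes $k$ is a positive cubefree integer small enough to factor by trial division.

```python
def conductor_of_isogenous_curve(k: int) -> int:
    """Evaluate prod_{p | 3k} p^(2 + beta_p) for a positive cubefree integer k."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    # Collect the primes dividing 3k by trial division on k.
    primes = {3}
    n, p = k, 2
    while p * p <= n:
        if n % p == 0:
            primes.add(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.add(n)
    # beta_3 depends on k mod 9; beta_p = 0 for every other prime.
    if k % 3 == 0:
        beta3 = 3
    elif k % 9 in (2, 7):      # k = +-2 mod 9
        beta3 = 0
    else:                      # k = +-1 or +-4 mod 9
        beta3 = 1
    conductor = 1
    for p in primes:
        conductor *= p ** (2 + (beta3 if p == 3 else 0))
    return conductor
```

For example, $k = 1$ gives $3^3 = 27$ and $k = 2$ gives $2^2 \cdot 3^2 = 36$.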

The curves $E'_k$ have the rational 3-torsion points $(U, V) = (0, \pm 4k)$. Since the
rank is an isogeny invariant, we produce as a corollary to our work examples of
elliptic curves $E'_k$ with a rational 3-torsion point and rank as high as 11. Curiously,
there are no other known elliptic curves over $\mathbf{Q}$ with a rational 3-torsion
point and rank greater than 8 [Du].

In section 2 we describe the geometric underpinnings of our search technique, 
which heavily uses rational parametrizations of various cubic surfaces, the points 
of which correspond to pairs of (usually independent) points on the curves $E_k$.
Section 3 gives a formula for an upper bound on the rank of $E_k$, using descent
via 3-isogeny. Section 4 describes some of the specific algorithms we used to 
produce examples of Ek with high rank. Finally, we give our numerical results 
in Section 5. 

2 Cubic Surfaces 

The most naive approach to constructing curves $E_k$ of high rank is to enumerate
small points on the curves $E_k$, which can be accomplished via the simple observation
that a point on some curve $E_k$ corresponds to a pair of whole numbers
$(x, y)$ so that the cubefree part of $x^3 + y^3$ (that is, the unique cubefree integer $s$
such that $(x^3 + y^3)/s$ is a perfect cube) is $k$. The second author used essentially
this approach to find the first known $E_k$ of rank 7. By incorporating some more
sophisticated techniques, such as 3-descent (see below), this approach could yield
curves with rank as high as 8. The weakness of this method is that the number
of such points up to height $H$ grows as $H^2$, most of which lie on curves of rank
1 and waste our time and/or memory.
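The naive enumeration can be sketched in a few lines of Python. This is an illustration under our own naming, not the program used for the searches reported here: it computes the cubefree part by trial division and groups coprime pairs $(x, y)$ with $x^3 + y^3 > 0$ by the resulting $k$.

```python
from collections import defaultdict
from math import gcd

def cubefree_part(n: int) -> int:
    """The cubefree s > 0 such that n / s is a perfect cube (n > 0)."""
    s, p = 1, 2
    while p * p <= n:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        s *= p ** (e % 3)
        p += 1
    return s * n  # what is left of n is 1 or a single prime

def enumerate_small_points(height: int):
    """Map k -> set of coprime pairs (x, y), |y| <= x <= height, on x^3 + y^3 = k d^3."""
    hits = defaultdict(set)
    for x in range(1, height + 1):
        for y in range(-x + 1, x + 1):   # keeps x^3 + y^3 positive
            if gcd(x, y) == 1:
                hits[cubefree_part(x**3 + y**3)].add((x, y))
    return hits
```

For instance, $37^3 + 17^3 = 6 \cdot 21^3$, so the pair $(37, 17)$ is recorded under $k = 6$. The double loop makes the quadratic growth in the height visible.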

We can reduce this to by considering only curves E^ together with 

a pair of points, which correspond to points on the cubic surface 

\[ S_1 : w^3 + x^3 = y^3 + z^3 \]



other than the trivial points on the lines w + x = y + z = 0, w + y = x + z = 0, 
w + z = x + y = 0. (This pairs-of-points idea is also used in [EW] to produce 
elliptic curves with high rank and smallest conductor known.) The cubic surface
whose points correspond to pairs of points on the isogenous curves $E'_k$,

\[ S_2 : wx(w + x) = yz(y + z), \]

and the "mixed" cubic surface

\[ S_3 : wx(w + x) = y^3 + z^3, \]

are also fruitful. Each of these cubic surfaces is smooth, and thus rational in the
sense that it is birational to $\mathbf{P}^2$ over $\overline{\mathbf{Q}}$; in fact, each has a rational parametrization
defined over $\mathbf{Q}$.

A parametrization of $S_1$ was found by the first author [El], and a parametrization
of $S_2$ follows fairly quickly: there is an obvious isomorphism between $S_1$
and $S_2$, defined a priori over $\mathbf{Q}(\sqrt{-3})$, but which actually descends to $\mathbf{Q}$. To
parametrize $S_3$, we used the following more general approach, provided by Izzet
Coskun [Co].

Let $S$ be a cubic surface defined over $\mathbf{Q}$, and suppose $L_1$ and $L_2$ are disjoint
lines on $S$, with $L_3$ a third line meeting both. Then there is a 3-dimensional space
of quadrics that vanish on this set of lines. Use a basis of this space to map $S$
into $\mathbf{P}^2$; the inverse map will be a parametrization of $S$. The parametrization
so obtained is defined over $\mathbf{Q}$ if all of the $L_i$ are rational, or if $L_3$ is rational
and $L_1, L_2$ are Galois conjugate. This construction realizes $S$ as $\mathbf{P}^2$ blown up at
six points; the six blown-down lines are the six lines (other than $L_i$) that meet
exactly two of the three lines $L_1, L_2, L_3$.

On both $S_1$ and $S_3$, the relative paucity of lines defined over $\mathbf{Q}$ means that,
up to automorphism of the surface, there is only one choice for the configuration
$L_1, L_2, L_3$. Thus for $S_1$ the parametrization obtained with this technique,

{w : X : y : z) = — 2 t^s — 2 tsr + ts^ + r^s — r^ — rs^ 

: — 2t^s + ts^ — 2tsr + 2rs^ — -|- — 2 r^s 

: 2t^ + 3 t^r — 2t^s — 2tsr + itr'^ + ts^ + 2rs^ — -I- — 2r^s 

: — 2 t^ + t^s — 3 t^r — 3 tr^ + 4 tsr — 2 ts^ — r^ + r^s — rs^), 

is equivalent to the one in [El], in the sense that one can be obtained from the
other by composing an automorphism of $S_1$ with a projective linear transformation
of $\mathbf{P}^2$. For $S_3$ we obtain the parametrization

{w : X : y : z) = {r^ — : r^s — s^t + t^r : r'^t — s^r — t'^s). 

In order to compute the $k$ for which a particular point on $S_3$ corresponds to a
pair of points on $E'_k$ and $E_k$, we must find the cubefree part of $wx(w + x)$, which
we do by factoring this number. It is therefore useful that $wx(w + x)$, which
is a polynomial of degree 9 in $r$, $s$, and $t$, decomposes as a product of three
linear and three quadratic factors. By contrast, the factorization of $w^3 + x^3$
in the parametrization of $S_1$ is as a product of one linear, two quadratic, and
one quartic factor; the difficulty of factoring the value of this quartic at $(r, s, t)$
severely limits the usefulness of the $S_1$ parametrization.

On $S_2$, there are, up to automorphism, seven different configurations of lines
$L_1, L_2, L_3$. Four of them lead to parametrizations where $wx(w + x)$ has five linear
and two quadratic factors; the parametrization of $S_2$ mentioned above is of this
type. The other three lead to factorizations into three linear and three quadratic
polynomials. There is not a significant computational advantage for one factorization
over another, but we do mention here a rather elegant parametrization
of $S_2$ obtained in this way:

\[ (w : x : y : z) = (-r^2 s + s^2 t : r^2 s - r t^2 : -r^2 t + s t^2 : r s^2 - s t^2). \]
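The last parametrization can be checked directly. The short Python sketch below (our own helper names) evaluates it at integer parameters, confirms that the image lies on $S_2$, and confirms the factorization of $wx(w + x)$ into three linear and three quadratic factors, namely $r\,s\,t\,(st - r^2)(rs - t^2)(s^2 - rt)$.

```python
import random

def s2_point(r: int, s: int, t: int):
    """Point (w, x, y, z) given by the parametrization of S_2 quoted above."""
    w = -r * r * s + s * s * t
    x = r * r * s - r * t * t
    y = -r * r * t + s * t * t
    z = r * s * s - s * t * t
    return w, x, y, z

def on_s2(w: int, x: int, y: int, z: int) -> bool:
    """Membership test for the surface w x (w + x) = y z (y + z)."""
    return w * x * (w + x) == y * z * (y + z)

def factored_value(r: int, s: int, t: int) -> int:
    """w x (w + x) written as three linear times three quadratic factors."""
    return r * s * t * (s * t - r * r) * (r * s - t * t) * (s * s - r * t)

def spot_check(trials: int = 200, bound: int = 50) -> bool:
    """Try random integer parameters; True if every check passes."""
    rng = random.Random(0)
    for _ in range(trials):
        r, s, t = (rng.randint(-bound, bound) for _ in range(3))
        w, x, y, z = s2_point(r, s, t)
        if not on_s2(w, x, y, z) or w * x * (w + x) != factored_value(r, s, t):
            return False
    return True
```

Since both sides are polynomial identities in $r, s, t$, agreement on many random integer points is strong evidence; a symbolic expansion would settle it completely.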

3 Descent via 3-isogeny 

A powerful tool for obtaining upper bounds for the ranks of the curves $E_k$ is
descent, since these curves admit the aforementioned 3-isogeny with $E'_k$. An
analysis of the descent for these curves first appeared in [Se]. What follows is
essentially a simplification of the formula given there.

Let $k = \prod_j p_j^{e_j}$ where $e_j = 1$ or $2$, and let $q_i$ be the primes dividing $k$ with
$q_i \equiv 1 \bmod 3$. Now define a matrix $A$ over $\mathbf{F}_3$ by:




if Pj ^ Qii and if Pj = Qi (equivalently, the rows of A 

sum to zero). Here denotes the cubic residue symbol mod q; note that for 
each $q \equiv 1 \bmod 3$, there are two choices for this cubic residue symbol, but they
lead to proportional rows $A_i$. If $k \equiv \pm 1$ or $0 \bmod 9$, add an additional row
corresponding to cubic characters mod 9 for $p_j$; if $9 \mid k$, the entry corresponding
to $p_j = q_i = 3$ is defined as in general when $p_j = q_i$.

If we let $\phi : E_k \to E'_k$ and $\phi' : E'_k \to E_k$ denote the relevant 3-isogenies,
then the row and column null spaces of $A$ correspond, respectively, to the $\phi$- and
$\phi'$-Selmer groups of $E_k/\mathbf{Q}$ and $E'_k/\mathbf{Q}$. We can conclude that, after taking the
3-torsion of $E'_k$ into account,

\[ \operatorname{rank}(E_k) \le \#\text{rows} + \#\text{columns} - 2 \cdot \operatorname{rank}(A) - 1. \]
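Once the matrix $A$ is known, the bound is a matter of linear algebra over $\mathbf{F}_3$. The following Python sketch takes $A$ as given (a list of rows of integers, read mod 3) and evaluates the right-hand side of the inequality; it does not construct $A$ from $k$, and the function names are ours.

```python
def rank_mod3(matrix):
    """Rank over F_3 of a matrix given as a list of rows of integers."""
    rows = [[entry % 3 for entry in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = rows[rank][col]  # in F_3 every nonzero element is its own inverse
        rows[rank] = [(inv * entry) % 3 for entry in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(a - factor * b) % 3 for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def selmer_rank_bound(matrix):
    """#rows + #columns - 2 rank(A) - 1 for a nonempty matrix A over F_3."""
    return len(matrix) + len(matrix[0]) - 2 * rank_mod3(matrix) - 1
```

For example, the $2 \times 3$ matrix with rows $(1, 2, 0)$ and $(2, 1, 0)$ has proportional rows, hence rank 1 and bound $2 + 3 - 2 - 1 = 2$.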

4 Computational Techniques 

An application of the explicit formula for descent via 3-isogeny is another technique
for searching for curves of large rank, which tends to be more effective
in finding the minimal $k$ such that $E_k$ has a given rank (it was actually this
technique that produced the current rank 9 record). The idea is to enumerate
all possible $k$ less than some given upper bound which have a sufficiently high
3-Selmer bound, and then to search for points on these curves. To do this, we
recursively build up products of primes, and at each stage compute the portion
of the matrix A corresponding to the primes chosen so far. Of course, the diag- 
onal entries remain in some doubt, since they depend on the whole row of A; 
still, at each stage a lower bound for the rank of the matrix A can be computed, 
and used to give a lower bound on the number of primes still needed. In this 
way one can vastly reduce the search space. A similar approach can be used 
to enumerate candidate curves whose conductor is smaller than a given bound, 
with the formula for the conductor given in the Introduction. 

It is also important to have a way to guess which curves are the most promis- 
ing before committing to a lengthy point search. The most important tool in 
this regard is provided by a heuristic argument suggesting that high rank curves 
should have many points on their reductions modulo $p$, or more specifically,

\[ \prod_{p \le x} \frac{\#E(\mathbf{F}_p)}{p + 1} \sim (\log x)^r, \]

where $r$ is the rank of $E$. This formula was conjectured by Birch and Swinnerton-Dyer
in [BSD], and the idea of using it to find elliptic curves of high rank is due
to Mestre [Me].

Note that for the curves $E_k$ it is only useful to consider primes $p \equiv 1 \bmod 3$,
since $E_k$ is supersingular mod $p$ when $p \equiv 2 \bmod 3$. Furthermore, computing
$\#E_k(\mathbf{F}_p)$ is quite fast: modulo each prime $p \equiv 1 \bmod 3$, there are only three
isomorphism classes of $E_k$, corresponding to the three cubic residue symbols
mod $p$. We compute the $a_p$'s for each of these isomorphism classes once for all,
and then to find the $a_p$ for a given curve $E_k$, we need only compute the cubic
residue symbol of $k$ mod $p$.
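The heuristic can be tried out on small cases with the following Python sketch. It is deliberately naive: it counts points on the Weierstrass model $Y^2 = X^3 - 432k^2$ by brute force with Euler's criterion, rather than through the cubic residue symbol shortcut described above, and it sums logarithms of the factors over primes $p \equiv 1 \bmod 3$ not dividing $k$. The function names are ours.

```python
from math import log

def count_points(k: int, p: int) -> int:
    """#E_k(F_p) on Y^2 = X^3 - 432 k^2, for a prime p not dividing 6k."""
    b = (-432 * k * k) % p
    total = 1  # the point at infinity
    for x in range(p):
        rhs = (x * x * x + b) % p
        if rhs == 0:
            total += 1
        elif pow(rhs, (p - 1) // 2, p) == 1:  # rhs is a nonzero square
            total += 2
    return total

def is_prime(n: int) -> bool:
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def mestre_log_sum(k: int, bound: int) -> float:
    """Sum of log(#E_k(F_p) / (p + 1)) over primes p <= bound, p = 1 mod 3, p not dividing k."""
    total = 0.0
    for p in range(7, bound + 1):
        if p % 3 == 1 and k % p != 0 and is_prime(p):
            total += log(count_points(k, p) / (p + 1))
    return total
```

Primes $p \equiv 2 \bmod 3$ contribute a factor of exactly 1, since then $\#E_k(\mathbf{F}_p) = p + 1$; this is why the sum skips them. A larger value of `mestre_log_sum` for one $k$ than for another is, heuristically, a hint of higher rank.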

In the end, though, we must still search for points on curves we suspect of 
having high rank. But here, too, there are improvements over the most naive 
approach. As noted above, points on $E_k$ correspond to pairs of whole numbers
$(x, y)$ such that $k = d^{-3}(x^3 + y^3)$. Of course, we may further assume that $x$ and
$y$ are relatively prime. It follows that $\gcd(x + y, x^2 - xy + y^2)$ is either 1 or 3,
whence $x + y$ must be a factor of $3k$ times a perfect cube. For each $x + y$ and $d$,
we must simply decide if the resulting quadratic equation has a rational solution.
Furthermore, we can use local conditions to reduce the number of possible $x + y$
we must consider (this is closely related to the descent described in the last
section). A similar approach works for a point search on $xy(x + y) = k$.
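The quadratic step can be made explicit. With $s = x + y$ fixed, $x^3 + y^3 = kd^3$ forces $x^2 - xy + y^2 = kd^3/s$, hence $3xy = s^2 - kd^3/s$, and $x, y$ are the roots of a quadratic with known sum and product. The Python sketch below (function name ours, integer solutions only) carries this out.

```python
from math import isqrt

def solve_for_sum(k: int, d: int, s: int):
    """Integers (x, y), x <= y, with x + y = s and x^3 + y^3 = k d^3, or None."""
    target = k * d**3
    if s == 0 or target % s:
        return None
    m = target // s              # m = x^2 - x y + y^2
    if (s * s - m) % 3:
        return None
    product = (s * s - m) // 3   # x y, because s^2 - m = 3 x y
    disc = s * s - 4 * product   # (x - y)^2
    if disc < 0:
        return None
    root = isqrt(disc)
    if root * root != disc:
        return None
    # root and s have the same parity, so both quotients are exact.
    return (s - root) // 2, (s + root) // 2
```

For example, `solve_for_sum(6, 21, 54)` recovers $17^3 + 37^3 = 6 \cdot 21^3$; here $54 = 2 \cdot 3^3$ is indeed a factor of $3k = 18$ times a perfect cube.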



5 Results 

Here we list the minimal known $k$ such that $E_k$ has rank $r$ for each rank $r \le 11$,
as well as the minimal known conductor of a curve $E_k$ of rank $r$ with $r \le 8$.
We include notes where relevant, and $r$ independent points for each of the new
record curves. The points are listed on the minimal Weierstrass equation for $E'_k$
as described in the Introduction. To transfer points back to the curve $E_k$ one
may use the dual isogeny $\phi'$ described there.
rank   k (minimal known with E_k of given rank)

  0    1
  1    6 = 2 · 3
  2    19 (prime)
  3    657 = 3 · 3 · 73
  4    21691 = 109 · 199
  5    489489 = 3 · 7 · 11 · 13 · 163
  6    9902523 = 3 · 73 · 103 · 439
  7    1144421889 = 3 · 13 · 19 · 41 · 139 · 271
  8    1683200989470 = 2 · 3 · 5 · 7 · 11 · 13 · 17 · 29 · 41 · 47 · 59
  9    349043376293530 = 2 · 5 · 37 · 41 · 53 · 73 · 1231 · 4831
 10    137006962414679910 = 2 · 3 · 5 · 7 · 23 · 31 · 37 · 43 · 83 · 109 · 151 · 421
 11    13293998056584952174157235
         = 3 · 5 · 7 · 13 · 19 · 23 · 31 · 43 · 59 · 61 · 73 · 79 · 103 · 109 · 157 · 457



The following records for ranks up to 5 are known to be minimal (the proof for 
ranks 4 and 5 seems to be new); for ranks 6 and 7, they are minimal provided 
the weak Birch and Swinnerton-Dyer Conjecture and the Generalized Riemann 
Hypothesis are true for all $L(E_{k'}, s)$ with $k' < k$. The records for ranks 8
through 10 are likely to be minimal. In each case, we use the approach described
in the last section to enumerate all of the smaller $k$ with a sufficiently large
3-Selmer group. For each of them we compute a partial product of $L(E_k, 1)$ over
the first 1000 or so primes; a large partial product should correspond to high 
rank. In each case of rank 8 through 10, the record k significantly distinguished 
itself from all smaller k. Note that for large r our record value of k tends to 
have considerably more prime factors congruent to 1 mod 3 than to 2 mod 3. 



rank   k (E_k of given rank and minimal known conductor)

  0    1
  1    9 = 3^2
  2    19 (prime)
  3    657 = 3 · 3 · 73
  4    34706 = 2 · 7 · 37 · 67
  5    763002 = 2 · 3^2 · 19 · 23 · 97
  6    24565833 = 3^2 · 17 · 307 · 523
  7    1144421889 = 3 · 13 · 19 · 41 · 139 · 271
  8    23381862574950 = 2 · 3^2 · 5^2 · 11 · 19 · 23 · 31 · 83 · 4201



The $k$ in the following chart are known to correspond to curves $E_k$ of minimal
conductor for $r \le 3$. For ranks 4, 5, and 6, they are minimal provided the
weak Birch and Swinnerton-Dyer Conjecture and the Generalized Riemann Hypothesis
are true for all $L(E_{k'}, s)$ with $k' < k$. The records for ranks 7 and 8 are
likely to be minimal; as above, Mestre's heuristic distinguishes them markedly
from all curves of smaller conductor. It is striking that for all ranks less than
8, the $k$ corresponding to minimal conductor are squarefree away from 3. Since
divisibility by $p^2$ as opposed to $p$ in $k$ does not alter the value of the conductor
of $E_k$, one might expect that for high rank, the $k$ corresponding to the curve of
minimal conductor would have many square factors.

We conclude by listing independent points on the record curves $E'_k$ for ranks
6 through 11, all of which were newly found using methods described here. A
program that implements LLL reduction on the lattice of points of $E_k$, provided
by Randall Rathbun [Ra], was used to reduce their heights where possible.

k = 9902523, r = 6 

6 independent points on the Weierstrass minimal curve for $E'_k$:

$y^2 + y = x^3 + 24514990441382$

(100092, 32051170), (-6798, 4919434), (-22338, 3656314), 

(43672, 10383069), (-11988, 4774114), (126720, 45380386). 

k = 24565833, r = 6 

6 independent points on the Weierstrass minimal curve for $E'_k$:

$y^2 + y = x^3 + 150870037745972$

(-37656, 9872932), (86292, 28167835), (187270, 81966083), 

(-32058, 10859260), (-39798, 9372019), (236572, 115719195). 

k = 1144421889, r = 7 

7 independent points on the Weierstrass minimal curve for $E'_k$:

$y^2 + y = x^3 + 327425365005582080$

(267748, 588744383), (1235988, 1488490048), (-333330, 538877944), 

(-648774, 233133847), (-422760, 501863680), (5104008, 11545190143), 
(-688974, 19483912). 



k = 1683200989470, r = 8 



8 independent points on the Weierstrass minimal curve for $E'_k$:

$y^2 = x^3 + 708291392738196762720225$

(-88860785, 81396479060), (-87348261, 204569627262), 

(-63256830, 674665720365), (-40588401, 800890328532), 

(101707060, 1326794024465), (-35705670, 814107128835), 

(-44793980, 786391914385), (8308684440, 757353550270065). 

In fact, the curve $E'_k$ for $k = 1683200989470$ has the remarkable property
that the Diophantine equation $xy(x + y) = k$ actually has 8 integral solutions,
namely (11, 391170), (533, 55930), (770, 46371), (1003, 40467), (2639, 23970),
(6970, 12441), (7293, 11977), (8555, 10387). These 8 solutions, considered as rational
points on $E'_k$, are independent.
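The eight integral solutions are easy to confirm by machine. The Python sketch below checks $xy(x + y) = k$ for each pair, taking the seventh pair to be $(7293, 11977)$, which is the value the equation requires for $x = 7293$. It also transports each solution to the minimal model $y^2 = x^3 + k^2/4$ through the change of variables $U = 4k/v$, $V = (8ku + 4kv)/v$, $(W, Z) = (U/4, V/8)$, which one can verify algebraically from $uv(u + v) = k$. The names are ours.

```python
from fractions import Fraction

K = 1683200989470
SOLUTIONS = [
    (11, 391170), (533, 55930), (770, 46371), (1003, 40467),
    (2639, 23970), (6970, 12441), (7293, 11977), (8555, 10387),
]

def is_solution(u: int, v: int, k: int = K) -> bool:
    """True if u v (u + v) = k."""
    return u * v * (u + v) == k

def to_minimal_model(u: int, v: int, k: int = K):
    """Image (W, Z) of (u, v) on Z^2 = W^3 + k^2 / 4 (k even)."""
    big_u = Fraction(4 * k, v)
    big_v = Fraction(4 * k * (2 * u + v), v)
    return big_u / 4, big_v / 8

def on_minimal_model(w: Fraction, z: Fraction, k: int = K) -> bool:
    return z * z == w**3 + Fraction(k * k, 4)
```

The constant term $k^2/4$ evaluates to 708291392738196762720225, matching the equation printed for this curve.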

k = 23381862574950, r = 8 



8 independent points on the Weierstrass minimal curve for $E'_k$:

$y^2 = x^3 + 136677874368461861091875625$

(1826174700, 78910161016275), (794409100, 25259021976275), 
(-259483950, 10918164759975), (499986216, 16176140969139), 
(-503804925, 2966885463150), (798185799, 25400836633032), 

(165873591, 11884516327764), (215137494, 12109307517153). 

k = 349043376293530, r = 9 

9 independent points on the Weierstrass minimal curve for $E'_k$:

$y^2 = x^3 + 30457819633596695100179965225$:

(-734843410, 173381106196815), (5130038900, 406775844709485), 
(-2676929565, 106184137901590), (690947990, 175464197227935), 
(291207945620, 157146639625792365), (25120488440, 3985280944128435), 
(-872639080, 172607374924365), (2918890200, 235215923409485), 
(-102315705, 174518619468760). 

k = 137006962414679910, r = 10 



10 independent points on the Weierstrass minimal curve for $E'_k$:

$y^2 = x^3 + 4692726937524378378756566939402025$:

(-135797482140, 46781315964225555), (-150436201545, 35891470127810220),
(-42200591214, 67952721291406041), (2327642247924, 3551854243978575507),
(5504535148140, 12914782107290941395), (140506152430, 86409469562070095),
(397507563420, 259814927561209005), (7162660587075, 19169656506442936830),
(73148794740, 71303068454026605), (-102758626586, 60063833881519937).

k = 13293998056584952174157235, r = 11 

11 independent points on the Weierstrass minimal curve for $E'_k$:

$y^2 + y = x^3 + 44182596082121121317135170025680399046545625711306$:

(-30156002278649820, 4093799681127459731025817), 

(11364087102067560, 6756491872572362690626342), 

(-20835788771691894, 5927660006237675713476241), 

(1134264920569989390, 1208031685828825118221478017), 
(8907565209691176834, 26585114133655761890666064910),
(111849199886121334, 37992674604901443769570910),
(11724873521668020, 6767159346634715672034457),
(-138658831412368575/4, 12719819443574268333325811/8),
(165971060901522240, 67941788876402816577138982),
(994768217796990, 6647073075327662243966017),
(532896351059436225/16, 576457310785324883248677823/64).
Abstract. The elliptic curve primality proving algorithm is one of the
fastest practical algorithms for proving the primality of large numbers.
Its fastest version, fastECPP, runs in heuristic time $\tilde{O}((\log N)^4)$. The aim
of this article is to describe new ideas used when dealing with very large 
numbers. We illustrate these with the primality proofs of some numbers 
with more than 10,000 decimal digits. 



1 Introduction 

The work by Agrawal, Kayal and Saxena [1] on the existence of a deterministic
polynomial time algorithm for deciding primality stimulated the field of primality
proving at large. In particular, it prompted the study and implementation of a fast
version of the elliptic curve primality proving algorithm (ECPP). We refer to [2]
for a presentation of the method and [13] for the description of the faster version,
originally due to J. O. Shallit. Whereas ECPP has a heuristic running time of
$\tilde{O}((\log N)^5)$ for proving the primality of $N$, the new algorithm has complexity
$\tilde{O}((\log N)^4)$. This new approach enabled one of us (FM) to prove the primality
of numbers with more than 7,000 decimal digits.

Independently, three of us (JF, TK and TW) started to write a new implementation
of ECPP in November 2002 which was available by December 2002,
and this was improved step by step until the team working in Bonn came up with
a set of programs used to prove the primality of $10^{9999} + 33603$ on August 19,
2003.

The two teams decided after this to exchange ideas and comparisons, forming 
the present article that concentrates on issues regarding distributed implemen- 
tations of fastECPP and its use in the proving of very large numbers. The theory 
of fastECPP will be described more fully in the final version of [13]. 

Our article is organized as follows. Section 2 provides a short description 
of fastECPP. Section 3 gives two strategies for distributing the computations. 
Section 4 deals with a faster way of looking for small prime factors of a bunch 
of numbers at the same time. This part has an independent interest and we
think that it could be useful in other algorithms. In Section 5, an alternative 
to the method of [9,5] for the root finding in the proving steps of ECPP is 
described. Section 6 deals with the use of fast multiplication beyond the GMP 
level in order to speed up all basic multiplications. In Section 7, we describe an 
early abort strategy for limiting the number of steps in ECPP. We conclude with 
timings on primality proofs for some very large numbers, obtained with either 
implementation. 



2 The Fast Version of ECPP 

Ordinary ECPP is described in [2] and fastECPP in [13]. We sketch the descrip- 
tion of the latter, assuming the reader has some familiarity with the algorithm. 

We want to prove that $N$ is prime. The algorithm builds a so-called downrun
that is a sequence of decreasing probable primes $N_0, N_1, \ldots, N_k$ such that $N_0 = N$
and the primality of each $N_i$ is sufficient to prove that of $N_{i-1}$. Theory tells
us that we should anticipate a length of $k = O(\log N)$ for the sequence.

If $q$ is an odd prime, put $q^* = (-1)^{(q-1)/2} q$; add to this the special primes
$-4, -8, 8$ as explained in [13].

The algorithm runs as follows: 

[Step 1.]

1.1. Find the $r$ smallest primes $q^*$ such that $\left(\frac{q^*}{N}\right) = 1$, yielding $Q = \{q_1^*, q_2^*,
\ldots, q_r^*\}$.

1.2. Compute all $\sqrt{q^*} \bmod N$ for $q^* \in Q$.

1.3. Try all subsets $S = \{q_{i_1}^*, q_{i_2}^*, \ldots, q_{i_s}^*\}$ of distinct elements of $Q$ for
which $-D = \prod_{j} q_{i_j}^*$ until a solution of the equation

\[ 4N = U^2 + DV^2 \qquad (1) \]

in rational integers $U$ and $V$ is found, which involves computing $\sqrt{-D} \bmod N$
and using Cornacchia's algorithm. When this is the case, let $\{U_1, \ldots, U_w\}$ be the
different $U$-values (we have at most $w = 6$ and generally $w = 2$).

[Step 2.] For all $U_i$'s, compute $m_i = N + 1 - U_i$; if none of these numbers can
be written as $cN'$ with $c$ a $B$-smooth number and $N'$ a probable prime, go to
Step 1. If there is a good one, call it $m$.

[Step 3.] Build an elliptic curve $E$ over $\overline{\mathbf{Q}}$ having complex multiplication by the
ring of integers of $K = \mathbf{Q}(\sqrt{-D})$.

[Step 4.] Reduce $E$ modulo $N$ to get a curve $\overline{E}$ of cardinality $m$.

[Step 5.] Find $P$ on $\overline{E}$ such that $[N']P = O_{\overline{E}}$. If this cannot be done, then $N$ is
composite, otherwise, it is prime.



[Step 6.] Set N = N' and go back to Step 1. 
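To make Step 1.3 concrete, here is a minimal Python sketch of the modified Cornacchia algorithm for equation (1), in the standard textbook form (a Euclidean descent from a square root of $-D$). It is an illustration on toy inputs, not the implementation discussed in this article: the modular square root is found by brute force, which is only sensible for very small $N$, and the function names are ours. It assumes $N$ is an odd prime not dividing $D$, with $-D \equiv 0$ or $1 \bmod 4$ and $D < 4N$.

```python
from math import isqrt

def sqrt_mod_bruteforce(a: int, n: int):
    """Some x with x^2 = a (mod n), or None. Exhaustive search: toy sizes only."""
    a %= n
    for x in range(n):
        if (x * x - a) % n == 0:
            return x
    return None

def solve_4n(n: int, d: int):
    """Return (U, V) with 4n = U^2 + d V^2, or None if no solution is found."""
    x0 = sqrt_mod_bruteforce(-d, n)
    if x0 is None:
        return None                 # -d is not a square modulo n
    if (x0 - d) % 2:
        x0 = n - x0                 # make x0 congruent to d modulo 2
    a, b = 2 * n, x0
    limit = isqrt(4 * n)            # floor(2 sqrt(n))
    while b > limit:                # Euclidean descent
        a, b = b, a % b
    rest = 4 * n - b * b
    if rest % d:
        return None
    v = isqrt(rest // d)
    if v * v != rest // d:
        return None
    return b, v
```

For example, `solve_4n(11, 7)` returns a pair with $U^2 + 7V^2 = 44$. Given such a $U$, Step 2 then examines candidate cardinalities of the form $N + 1 - U$.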
Note that what differentiates ECPP from its fast version is Step 1. In Step
1.3, we only consider fundamental discriminants, as a curve with CM by a non- 
principal order has the same cardinality as one with CM by the principal order. 

ECPP is a Las Vegas type algorithm. Its running time cannot be proved 
rigorously, but only in a heuristic way using standard hypotheses in number 
theory. When given a number N, it can answer one of three things: “N is prime” , 
“N is composite” or “I do not know”. In the first two of these cases, the answer 
is definitely correct and there is an accompanying proof that can be verified in 
polynomial time. The problem is in showing that the third case happens with 
very low probability. 

In real life, programs implementing (fast)ECPP should follow this philosophy 
and never return something wrong. When the third answer is returned, this 
corresponds very frequently to the fact that the program ran out of precomputed 
data (such as discriminants, or class polynomials) or used too small factorisation 
parameters in Step 2. The programmer has to correct this and start again with 
the number. We never saw a number resisting indefinitely, though we cannot 
prove none exists. 

All algorithms and tricks [11,12] developed over the years for ECPP apply
mutatis mutandis to the new version. This includes the invariants developed in
[6,4] and the Galois approach for solving the equation $H_D[u](X) \equiv 0$ modulo $p$ (see
[9,5]) needed in Step 3, which favors smooth class numbers.

When dealing with very large numbers (10000 decimal digit numbers, say), 
every part of the algorithm should be scrutinized again, which includes optimiz- 
ing the basic routines beyond the current level of GMP. In Step 2., B-smooth 
numbers are to be identified. The number B is important in the actual running 
time, and its precise value must be set depending on the algorithm used. See 
Section 4 below. A new strategy (early abort) is described in section 7. Also, 
Step 3-4 can be merged, as explained in section 5. 



3 First Strategies for Distribution 

Step 1-2 and Step 3-5 are easy to distribute over clusters of workstations. In this 
section, we give two distribution strategies. 

3.1 Strategy 1 

The following is easily implemented when all slaves have the same computing 
power, making it a parallel implementation. 

S1. The master sends to each slave a batch of $\sqrt{q^*}$ to compute.

S2. Each slave computes its batch and sends the results back to the master.

S3. The master sends all the square roots to all the slaves, so that each slave
can compute any $\sqrt{-D}$ that is needed.

S4. The master sends batches of $D$'s to all the slaves. Each slave is responsible
for the resolution of (1) and the factorization of the $m$'s. If one is found, it is
sent back to the master which checks the results and restarts a new phase.
S3 needs synchronization and communications.

In S4, load balancing is not easy, since the results are probabilistic in nature
(for which $D$ does $N$ split?). A probabilistic answer is to compute beforehand the
average number of splitting $D$'s that can occur. By genus theory, each $D$ with $t$
prime factors has splitting probability $g(-D)/h(-D)$ with $g(-D) = 2^{t-1}$. This
suggests building the whole set of discriminants $D$ in Step 1.3 and sorting them
with respect to $(h(-D)/g(-D), D)$. One could also add a criterion describing
the difficulty of building the class polynomial $H_D(X)$ later on, maybe using the
height of the polynomial (as evaluated in [4]). We send to each slave discriminants
$D_{i_1}, D_{i_2}, \ldots, D_{i_l}$ in such a way that

\[ \sum_{j} \frac{g(-D_{i_j})}{h(-D_{i_j})} \approx 5 \]

(the value of 5 is somewhat arbitrary) which corresponds to the fact that on 
average, 5 values of D will be splitting values. Of course, this quantity should 
depend on the power of the slave. 
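A possible batching routine along these lines is sketched below in Python. It is an illustration under assumptions of our own: each discriminant is described by a hypothetical triple `(D, h, t)` holding $D$, the class number $h(-D)$ and the number $t$ of prime factors, the list is sorted by $(h/g, D)$ as suggested above, and batches are closed greedily once the summed splitting probability reaches the target.

```python
from fractions import Fraction

def splitting_probability(h: int, t: int) -> Fraction:
    """g(-D) / h(-D) with g(-D) = 2^(t - 1)."""
    return Fraction(2 ** (t - 1), h)

def make_batches(discriminants, target=5):
    """Group (D, h, t) triples into batches with expected number of splits >= target."""
    ordered = sorted(
        discriminants,
        key=lambda item: (Fraction(item[1], 2 ** (item[2] - 1)), item[0]),
    )
    batches, current, expected = [], [], Fraction(0)
    for d, h, t in ordered:
        current.append(d)
        expected += splitting_probability(h, t)
        if expected >= target:
            batches.append(current)
            current, expected = [], Fraction(0)
    if current:                  # leftover batch below the target
        batches.append(current)
    return batches
```

A slave of different power would simply be given a different `target`.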



3.2 Strategy 2 

Another approach would be to set up a complicated system in which the master
keeps track of the work of each slave and decides what kind of work to do at
some point. This is easily implemented on the side of the slaves: they wait for a
task from the master, do it and send the result back. We now describe a possible
implementation of the master.

There are 6 different tasks which the slaves work on:

T1. Check whether the class number for a discriminant $D$ is good, i.e. is not
too big and does not contain a very large prime factor.

T2. Compute a modular square root $\sqrt{q^*}$ modulo $N$.

T3. Try to solve (1) for a given $D$.

T4. Do trial division for an interval of primes and a batch of $m$'s and return
the factored parts. See Section 4.

T5. Do a Fermat test.

T6. Do a Miller-Rabin test.

The master keeps lists of tasks of these six types which at the beginning are
all empty. If all task lists are empty the master creates new tasks of type T1.
The tasks are sorted according to their priority, T1 having lowest and T6 having
highest priority. If a slave requests a new task the master selects from all available
tasks one with the highest priority and passes it to this slave. A completed task
will create a varying number of new tasks, e.g. a computed square root (T2) may
create many tasks of type T3 whilst a Fermat test (T5) will only on success
create a task of type T6. If a certain number of tasks T6 for the same integer
are successful one reduction step is finished and pending tasks are cancelled.
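The bookkeeping of the master can be sketched with a priority queue. The Python class below is a toy model under our own assumptions: task payloads are opaque, new T1 tasks are drawn from a hypothetical counter standing in for a stream of candidate discriminants, and no networking or result handling is shown.

```python
import heapq
from itertools import count

# T1 has the lowest priority and T6 the highest, as described above.
PRIORITY = {"T1": 1, "T2": 2, "T3": 3, "T4": 4, "T5": 5, "T6": 6}

class Master:
    def __init__(self):
        self._heap = []
        self._order = count()          # tie-breaker: first in, first out
        self._candidates = count(3)    # hypothetical source of discriminants

    def add_task(self, kind: str, payload) -> None:
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(self._heap, (-PRIORITY[kind], next(self._order), kind, payload))

    def request_task(self):
        """Called when a slave asks for work: hand out a highest-priority task."""
        if not self._heap:
            self.add_task("T1", next(self._candidates))
        _, _, kind, payload = heapq.heappop(self._heap)
        return kind, payload

    def cancel_pending(self) -> None:
        """Called when a reduction step is finished."""
        self._heap.clear()
```

Because higher-numbered tasks are served first, work that is closest to completing a reduction step is never starved by the cheap exploratory tasks.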
4 An Optimized Test for the Divisibility by Small Primes

Let us first analyze the effect of trial division on the number of pseudo-primality 
tests (the most time consuming part of our implementation). Suppose we do 
trial division up to B on a number N = fR where / is only divisible by primes 
< B and R only divisible by primes > B. Let us assume that log B < -^log N, a 
condition which is almost always satisfied in practice. Let Pmax(?^) be the largest 
prime divisor of n. One can combine the prime number theorem with Rankin’s 
trick and Mertens’s theorem [14, 9.1.5 and 9.1.8] and related facts to investigate 
the sums 



Pmax(/)<^ 

I = log(/)7r(a;//) 

Pmax(/ 

where tt{x) counts the number of primes below x. Assuming log B < -^log x, we 
find them to be 



_ a: exp( 7 ) (log(R) + 0 ( 1 )) 

* log(a;) 

_ X exp(y) (log(R)2 + 0(log B)) 
log(:r) 

where $\gamma$ is Euler's constant. Since s counts the number of N < x for which R
is prime, while l is the sum of log(f) over such N, one concludes that for a
randomly chosen $N \in [(1-\epsilon)x, x]$ with a fixed positive $\epsilon < 1$ and for
$x \to \infty$, $B \to \infty$ subject to the above condition on log B, the probability
that R is prime tends to $\exp(\gamma)\log(B)/\log(N)$ while the expectation value of
log(f) is log(B) + O(1). By this heuristic argument, one expects the number of
pseudo-primality tests for a reduction of fixed size to be proportional to
$(\log B)^{-2}$: each test succeeds with probability proportional to log B, and each
success reduces the size by about log B.
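
As a numerical illustration of the Mertens-type estimate used above, the following sketch compares the product of $(1 - 1/p)^{-1}$ over primes p < B with $\exp(\gamma)\log B$, and evaluates the heuristic probability $\exp(\gamma)\log(B)/\log(N)$. The function names are hypothetical, and the values of B and N in the usage note are only examples.

```python
from math import exp, log

EULER_GAMMA = 0.5772156649015329


def primes_below(bound):
    """Sieve of Eratosthenes: all primes p < bound."""
    if bound < 3:
        return []
    sieve = bytearray([1]) * bound
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(bound ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, bound, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def mertens_ratio(bound):
    """prod_{p < bound} (1 - 1/p)^(-1) divided by exp(gamma) * log(bound)."""
    product = 1.0
    for p in primes_below(bound):
        product /= 1.0 - 1.0 / p
    return product / (exp(EULER_GAMMA) * log(bound))


def prob_cofactor_prime(log_bound, log_n):
    """Heuristic probability exp(gamma) * log(B) / log(N) that R is prime."""
    return exp(EULER_GAMMA) * log_bound / log_n
```

For instance, `prob_cofactor_prime(32 * log(2), 9999 * log(10))` evaluates the heuristic for a 10000-digit N with the example bound B = 2^32 (an assumed value, chosen only for illustration).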

We now describe how to perform the trial division in Step 2 more efficiently
by doing it on many numbers simultaneously. This is essentially a simplification
of the algorithm in [3]. Let N be a (pseudo)prime for which we want to do a
reduction step and let $m_1, \ldots, m_t$ be the numbers computed in Step 2 (we may
choose t of order 2 °og^ ). For simplicity we assume that $t = 2^u$ is a power of two.
The following algorithm strips the primes up to B from the numbers $m_i$:

1. Build the product $P = \prod_{p < B,\ p\ \mathrm{prime}} p$ using a binary tree. Unless the
bound B is changed this has to be done only once.

2. Compute the product $M = \prod_{i=1}^{t} m_i$ as follows: Set $m_i^{(0)} = m_i$ and for
$a = 1, \ldots, u$ successively compute $m_i^{(a)} = m_{2i-1}^{(a-1)} m_{2i}^{(a-1)}$,
$1 \le i \le 2^{u-a}$. Set $M = m_1^{(u)}$.

3. Compute $\tilde M = P \bmod M$ and set $\tilde m_1^{(u)} = \tilde M$.

4. Compute $\tilde m_i = P \bmod m_i$ as follows: For $a = u, \ldots, 1$ compute
$\tilde m_{2i-1}^{(a-1)} = \tilde m_i^{(a)} \bmod m_{2i-1}^{(a-1)}$ and
$\tilde m_{2i}^{(a-1)} = \tilde m_i^{(a)} \bmod m_{2i}^{(a-1)}$, $1 \le i \le 2^{u-a}$.
Set $\tilde m_i = \tilde m_i^{(0)}$.

5. For $1 \le i \le t$ replace repeatedly $\tilde m_i$ by $\gcd(m_i, \tilde m_i)$ and $m_i$ by
$m_i / \tilde m_i$ until $\tilde m_i = 1$.
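
The five steps can be sketched as follows. This is a minimal illustration of a product tree followed by a remainder tree, not the authors' code: the function names are hypothetical, Python's built-in integers stand in for the FFT-based arithmetic, and the input length need not be a power of two (an unpaired node is simply carried up one level).

```python
from math import gcd, prod


def primes_below(bound):
    """Sieve of Eratosthenes: all primes p < bound."""
    if bound < 3:
        return []
    sieve = bytearray([1]) * bound
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(bound ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, bound, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def product_tree(values):
    """Step 2: levels[a][i] is the product of a block of 2**a inputs."""
    levels = [list(values)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([prod(prev[j:j + 2]) for j in range(0, len(prev), 2)])
    return levels


def strip_small_primes(ms, bound):
    """Return (smooth part, cofactor) for every m, stripping primes < bound."""
    big_p = prod(primes_below(bound))          # step 1 (precomputation)
    levels = product_tree(ms)                  # step 2
    rems = [big_p % levels[-1][0]]             # step 3: P mod M
    for level in reversed(levels[:-1]):        # step 4: descend the tree
        rems = [rems[j // 2] % m for j, m in enumerate(level)]
    result = []
    for m, r in zip(ms, rems):                 # step 5: repeated gcd
        smooth = 1
        g = gcd(m, r)
        while g > 1:
            m //= g
            smooth *= g
            g = gcd(m, g)
        result.append((smooth, m))
    return result
```

The point of the tree is that the large product P is reduced only once, modulo M; all later reductions involve operands of rapidly decreasing size.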



Remarks:

Note that for P < M we can save some of the top level computations and the
application of the algorithm becomes equivalent to several applications with a
smaller u, grouping the $m_i$ appropriately. So we assume that u is chosen such
that P > M holds.

If we store the partial products

\[ P_k = \prod_{p < B/2^k,\ p\ \mathrm{prime}} p \]

which are computed during the precomputation we can decrease the trial division
bound by powers of 2 with no extra effort.

Step 5 can be improved e.g. by replacing $\tilde m_i$ by $\gcd(m_i, \tilde m_i^2)$ in the iteration.

We now analyze the cost of the algorithm. Let M(n) denote the cost of a
multiplication of two numbers of size exp(n). We assume that the FFT is used
and set $M(n) = O(n (\log n)^{1+\epsilon})$. The first step is a precomputation whose cost
is $O(B (\log B)^{2+\epsilon})$. The cost for the second step is

\[ \sum_{k=0}^{u-1} 2^{u-1-k} M(2^k \log N) = O\left(u\,2^u \log(N)\,(\log(2^u \log N))^{1+\epsilon}\right) \]

since all $m_i$ are of size N. In the fourth step the operation count is the same
except that an n × n-multiplication is replaced by a 2n : n-division. Since the
latter is asymptotically as fast as the former the cost for step 4 is the same as for
step 2. Since $P \approx \exp(B)$ the cost of the third step is $O(B (\log(2^u \log N))^{1+\epsilon})$.
The last step as described above has complexity $O(2^u (\log N)^2)$ up to a power of
log log N, since the iteration ends after at most $\log_2 N$ steps each consisting of a
division and a gcd. This could be improved by modifying this step but we do not
need it here. Note also that in practice this step consists mainly of the first
$\gcd(m_i, \tilde m_i)$, which with high probability is very small, and the number of
iterations also is very small.

Assuming 2" < log N and neglecting log log-terms the time spent in pseudo- 
primality tests is 0{ ) for a reduction of size logR whereas the time for 

trial division is 0{B + (logN)^). So it is optimal to choose B = 0((logfV)^) 
which also implies that the cost for the precomputation can be neglected. 

Some remarks about storage and parallelization: 

The algorithm above needs a lot of memory, most of it at the end of the
computation of P. To reduce the memory requirement we may compute partial
products $P_j$ of the primes below B whose product is P and modify step 3 into
computing the residues $P_j \bmod M$ and the modular product of these residues.
For this to be efficient the partial products should be larger than M.

For a distributed implementation we propose to split P into as many pieces
as slaves are present. Each slave executes steps 2-4 of the algorithm for its own
$P_j$ and passes at the end $\gcd(m_i, \tilde m_i) = \gcd(m_i, P_j)$ to the master. The master
assembles this information and executes step 5 which in practice is very fast.
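
The division of labour can be sketched as below. It is a simplified illustration under stated assumptions: each "slave" reduces its partial product directly modulo each $m_i$ instead of running steps 2-4 with a product tree, and all function names are hypothetical. The master may multiply the returned gcds because the partial products are squarefree and pairwise coprime.

```python
from math import gcd, prod


def split_prime_product(primes, pieces):
    """Partition the primes into contiguous chunks; return the products P_j."""
    size = -(-len(primes) // pieces)  # ceiling division
    return [prod(primes[k:k + size]) for k in range(0, len(primes), size)]


def slave_gcds(ms, partial_product):
    """What one slave reports: gcd(m_i, P_j) for every m_i."""
    return [gcd(m, partial_product % m) for m in ms]


def master_combine(ms, reports):
    """Assemble the slaves' gcds and run step 5 on each m_i."""
    out = []
    for i, m in enumerate(ms):
        g = prod(report[i] for report in reports)  # equals gcd(m_i, P)
        smooth = 1
        while g > 1:
            m //= g
            smooth *= g
            g = gcd(m, g)
        out.append((smooth, m))
    return out
```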

For the number -|- 33603, the bound B was set to 2®®. 

5 Computing Roots of the Hilbert Polynomial Modulo p 

The run-down sequence contains, among other things, a list of pairs (p, D), where
p is a pseudo-prime and where it is expected that an elliptic curve with complex
multiplication by the ring $O_K$ of integers in $K = \mathbb{Q}(\sqrt{-D})$ can be used to derive
the primality of p from the primality of some smaller number. It is necessary to
compute an element $j_p$ of $\mathbb{F}_p$ which is the j-invariant of an elliptic curve over $\mathbb{F}_p$
with complex multiplication by $O_K$.

We outline the method which was used to perform this step for the run-down
sequence of $10^{9999} + 33603$. As in [9], the idea is to split this task into
several steps, each one involving the determination of the modular solution of
an equation of degree $\ell_i$, where the $\ell_i$ are the prime factors of the class number
h of K. One difference is that [9] tries to compute a sequence of polynomials
which define a sequence of intermediate fields terminating in the Hilbert class
field L of K. By contrast, the implementation which was used for $10^{9999} + 33603$
constructs methods to reduce elements x of the intermediate fields modulo an
appropriate prime ideal over p, increasing the subfield in each step. The element
x is given by floating point approximations to its conjugate algebraic numbers.
Thus, the sequence of intermediate fields occurs in both methods but otherwise
the language used is somewhat different, making it difficult to explain to what
extent the methods are similar. Since the available space is not sufficient for a
careful description of the new method, we give a short example explaining how
it works in the case p = 479, D = 335.

The program chooses the modular invariant x = X\^ from [2, 2.7.1] and a 
precision of 32 bits is sufficient for the floating point calculations. The class group
of K, and therefore Gal(L/K), is cyclic of order 18. The program has selected a
generator $\sigma$ of the Galois group, and has computed the complex numbers $\sigma^i(x)$.
It has then decided to choose the prime ideal $\mathfrak{p}_0 \subset O_K$ such that

\[ (a + b\sqrt{-D}) \bmod \mathfrak{p}_0 = \frac{1}{2}\left(\left([2a] + 12\left[\frac{2b\sqrt{D}}{\sqrt{D}}\right]\right) \bmod p\right), \qquad (2) \]

where $\sqrt{D} = 18.30300521772312668$ is the positive square root of D, the
subexpressions in square brackets are in real life floating point numbers which will
be rounded to nearest integers, and the factor 12 in the second summand is of
course a square root of −D mod p.

The Hilbert class field has unique subfields $L_1$ and $L_2$ of degrees 2 and 3 over
K. The program knows that the genus field $L_1$, which is the largest subfield of
L which is Abelian over Q, is given by $L_1 = K(\sqrt{5})$. It decides to work with the
prime ideal $\mathfrak{p}_1 \subset O_{L_1}$ such that, for each element z of $L_1$ given by a complex
floating point approximation to z and $\sigma(z)$, we have

\[ z \bmod \mathfrak{p}_1 = \frac{1}{2}\left((z + \sigma(z)) \bmod \mathfrak{p}_0 + 196\,((z - \sigma(z))\sqrt{5} \bmod \mathfrak{p}_0)\right), \qquad (3) \]

where $\sqrt{5} = 2.23606798$ and the factor 196 in the second summand is of course
a modular square root of 1/5 mod p. The reductions modulo $\mathfrak{p}_0$ occurring in this
formula are computed by the program using (2).
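
The two reduction maps can be checked numerically. The sketch below implements (2) and (3) for p = 479, D = 335 with the constants 12 and 196 from the text; the function names are hypothetical, an element of K is passed as a Python complex number (real part a, imaginary part $b\sqrt{D}$), and an element of $L_1$ is passed together with its conjugate.

```python
from math import sqrt

P = 479
D = 335
SQRT_MINUS_D = 12     # 12**2 = 144 = -335 (mod 479)
INV_SQRT5 = 196       # 5 * 196**2 = 1 (mod 479)
INV2 = pow(2, -1, P)  # modular inverse of 2 (Python 3.8+)


def reduce_mod_p0(z):
    """Formula (2): z = a + b*sqrt(-D), given as a complex float."""
    two_a = round(2 * z.real)
    two_b = round(2 * z.imag / sqrt(D))
    return (two_a + SQRT_MINUS_D * two_b) * INV2 % P


def reduce_mod_p1(z, sigma_z):
    """Formula (3): z in L_1 = K(sqrt 5), given together with sigma(z)."""
    trace_part = reduce_mod_p0(z + sigma_z)
    sqrt5_part = reduce_mod_p0((z - sigma_z) * sqrt(5))
    return (trace_part + INV_SQRT5 * sqrt5_part) * INV2 % P
```

As a consistency check, reducing $z = \sqrt{5}$ (with $\sigma(z) = -\sqrt{5}$) gives 22, and indeed 22^2 = 484 = 5 (mod 479).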

It is more difficult to describe a prime ideal $\mathfrak{p}_2 \subset O_{L_2}$ with
$\mathfrak{p}_2 \cap O_K = \mathfrak{p}_0$ in a way which is suitable for calculations. The program considers

\[ x_2 = \sum_{i=0}^{5} \sigma^{3i}(x) \]

which is an algebraic integer. We have

\[ x_2 = -60.2484307 + 78.0404771\sqrt{-1} \]
\[ \sigma(x_2) = -14.7805113 - 15.4588718\sqrt{-1} \]
\[ \sigma^2(x_2) = -10.4710580 + 1.47891293\sqrt{-1}. \]

Note that $\sigma^i(x_2)$ depends only on i mod 3. The program computes a complex
floating point approximation to the minimal polynomial of $x_2$ over K and finds,
using (2) to reduce polynomial coefficients modulo $\mathfrak{p}_0$, that this polynomial is
congruent to $T^3 + 283T^2 + 226T + 108$ modulo $\mathfrak{p}_0$. It finds that 341, 395 and 418
are the roots of this polynomial modulo p and decides to work with the prime
ideal $\mathfrak{p}_2 \subset O_{L_2}$ such that $x_2 \equiv 341 \pmod{\mathfrak{p}_2}$. It computes coefficients
$v_i \in \mathbb{F}_p$, $0 \le i < 3$, such that

\[ z \bmod \mathfrak{p}_2 = \sum_{i=0}^{2} v_i \left(\left(\sum_{j=0}^{2} \sigma^{i+j}(x_2)\,\sigma^{j}(z)\right) \bmod \mathfrak{p}_0\right), \qquad (4) \]

where in practice the reduction mod $\mathfrak{p}_0$ is carried out using (2). For this to
be possible, it is necessary that $x_2$ generates a normal basis of $O_{L_2}$ over $O_K$
after localisation at $\mathfrak{p}_0$. The program will abort if this assumption fails. This
does not happen in this example, nor did it ever happen during the calculations
for $10^{9999} + 33603$. But it should be possible to construct examples of failure
of the program, although it is very unlikely for this to happen in practice. In
order to determine the coefficients of (4), it is also necessary that the modular
roots of the minimal polynomial have been ordered in such a way that the i-th
root equals $\sigma^i(x_2) \bmod \mathfrak{p}_2$. The choice of the root assigned to i = 0 is of course
free, since a different choice only gives a different $\mathfrak{p}_2$. But the order of the other
modular zeros is no longer free and the program has to compute them in the
correct order. We will describe in a different paper how this is done without a
major increase of the required precision, where of course the order of the cyclic
extension will often be larger than 3. Once the roots have been computed in the
correct order, it is a linear algebra task to determine the $v_i$ such that (4) holds
for $z = \sigma^k(x_2)$, $0 \le k < 3$. In the given example, the result is

\[ v_0 = 417, \quad v_1 = 170, \quad v_2 = 393. \]
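
The numbers of this example can be cross-checked. The sketch below reduces the elementary symmetric functions of the three quoted conjugates of $x_2$ with formula (2), which should reproduce the cubic modulo $\mathfrak{p}_0$, and then evaluates the right-hand side of (4) for $z = \sigma^k(x_2)$ using the modular roots. It assumes that the roots are listed in the order $\sigma^0, \sigma^1, \sigma^2$ (341, 395, 418); the function names are hypothetical.

```python
from math import sqrt

P, D = 479, 335
SQRT_MINUS_D = 12
INV2 = pow(2, -1, P)

X2_CONJUGATES = [
    complex(-60.2484307, 78.0404771),
    complex(-14.7805113, -15.4588718),
    complex(-10.4710580, 1.47891293),
]
ROOTS = [341, 395, 418]   # assumed order: sigma^0, sigma^1, sigma^2
V = [417, 170, 393]


def reduce_mod_p0(z):
    """Formula (2) for an element a + b*sqrt(-D) given as a complex float."""
    two_a = round(2 * z.real)
    two_b = round(2 * z.imag / sqrt(D))
    return (two_a + SQRT_MINUS_D * two_b) * INV2 % P


def cubic_mod_p0(conjugates):
    """Coefficients [c2, c1, c0] of T^3 + c2*T^2 + c1*T + c0 modulo p_0."""
    a, b, c = conjugates
    return [
        reduce_mod_p0(-(a + b + c)),
        reduce_mod_p0(a * b + a * c + b * c),
        reduce_mod_p0(-(a * b * c)),
    ]


def formula4(k):
    """Right-hand side of (4) for z = sigma^k(x_2), using modular roots."""
    total = 0
    for i in range(3):
        inner = sum(ROOTS[(i + j) % 3] * ROOTS[(k + j) % 3] for j in range(3))
        total += V[i] * (inner % P)
    return total % P
```

With these inputs the cubic comes out as [283, 226, 108], and `formula4(k)` returns the k-th root for k = 0, 1, 2, in agreement with the text.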

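The compositum $L_3 = L_1 L_2$ has degree 6 over K, and there is a unique prime
ideal $\mathfrak{p}_3 \subset O_{L_3}$ such that $\mathfrak{p}_3 \cap O_{L_i} = \mathfrak{p}_i$ for $i \in \{1, 2\}$. If an element z of
$O_{L_3}$ is given by complex floating point approximations to $\sigma^i(z)$, where
$0 \le i < 6$, then

\[ z \bmod \mathfrak{p}_3 = \frac{1}{2}\left(z_0 \bmod \mathfrak{p}_2 + 196\,(z_1 \bmod \mathfrak{p}_2)\right), \qquad (5) \]

where the $z_0, z_1 \in L_2$ are given by

\[ \sigma^i(z_0) = \sigma^i(z) + \sigma^{i+3}(z) \]
\[ \sigma^i(z_1) = \sqrt{5}\,(\sigma^i(z) - \sigma^{i+3}(z)), \]

and $z_i \bmod \mathfrak{p}_2$ is computed by (4).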

The program now computes a complex floating point approximation to $\sigma^i(P)$
for $0 \le i < 6$, where P is the minimal polynomial of x over $L_3$. Using (5), it
finds P to be congruent to $T^3 + 151T^2 + 434T + 346$ modulo $\mathfrak{p}_3$. The largest
rounding error was 0.000488281. One modular root of P is 153. This means that
there exists a prime ideal $\mathfrak{P}$ dividing p in $O_L$ such that $x \equiv 153 \bmod \mathfrak{P}$. From x,
one can compute the j-invariant of an elliptic curve with complex multiplication
by $O_K$ using the formula

\[ j = \frac{(x^4 + 7x^3 + 20x^2 + 19x + 1)^3 (x^2 + 5x + 13)}{x}. \]

With x = 153 mod 479, this gives j = 307 mod 479.
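
The final evaluation is easy to reproduce. The sketch below (function name hypothetical) evaluates the j-formula modulo p with a modular inverse and also confirms that 153 is a root of the cubic $T^3 + 151T^2 + 434T + 346$ modulo 479.

```python
def j_from_x(x, p):
    """j = (x^4 + 7x^3 + 20x^2 + 19x + 1)^3 * (x^2 + 5x + 13) / x modulo p."""
    quartic = x ** 4 + 7 * x ** 3 + 20 * x ** 2 + 19 * x + 1
    numerator = pow(quartic, 3, p) * (x * x + 5 * x + 13)
    # pow(x, -1, p) is the modular inverse (Python 3.8+).
    return numerator * pow(x, -1, p) % p


def is_root_of_cubic(x, p):
    """Check that x is a root of T^3 + 151 T^2 + 434 T + 346 modulo p."""
    return (x ** 3 + 151 * x ** 2 + 434 * x + 346) % p == 0
```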

Calculating the minimal polynomial of x over K and reducing it modulo $\mathfrak{p}_0$,
using (2), turns out to be impossible with 32 bit precision. If 48 bits are used,
the largest rounding error is 0.0195312. Of course, this increase of the required
precision is due to the fact that the theory of the genus field was not used.

The program used for $10^{9999} + 33603$ was a development version, with many
possible optimisations not yet implemented. For instance, it is clear from the
above example that not all Weber class invariants were implemented.

6 Use of the Fastest Fourier Transform in the West 

For most of the calculations for $10^{9999} + 33603$, we used integer multiplication
using the Fastest Fourier Transform in the West (see http://www.fftw.org
and [7,8]). To square a number of size $10^{9999}$, it was broken into 1661 digits of
size 20 bit. These digits were inserted into an array of 3600 double variables,
which was then transformed using the functions provided by libfftw3.a; the
result was squared and transformed back, also using libfftw3.a. The same thing
can be done for a product of two different factors, and if a factor occurs often
then its Fourier transform may be precalculated and stored to reduce the time
for a multiplication by this factor to the time for a squaring. It is easy to see
that this choice of parameters does not guarantee exact results. Therefore, we
also calculated a 32-bit checksum. If the checksum test indicates an error, the
multiplication is recalculated using the GMP function. In the case of the p10000,
these recalculations appear to be rare, if they occur at all. Of course, even the
checksum does not make this multiplication method rigorous.
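
The scheme can be illustrated in pure Python. The sketch below is not the FFTW-based code: it uses a simple recursive complex FFT, 20-bit digits as in the text, and a checksum taken modulo the prime 2^32 - 5 (how the authors computed their 32-bit checksum is not stated, so this choice is an assumption). On a checksum mismatch it falls back to exact integer multiplication, which plays the role of the GMP call.

```python
import cmath

DIGIT_BITS = 20
CHECK_MOD = 4294967291  # 2**32 - 5, a prime; assumed checksum modulus


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def to_digits(x):
    """Split a non-negative integer into 20-bit digits, least significant first."""
    mask = (1 << DIGIT_BITS) - 1
    digits = []
    while x:
        digits.append(x & mask)
        x >>= DIGIT_BITS
    return digits or [0]


def fft_multiply(x, y):
    """Multiply via floating point FFT, verified by a 32-bit checksum."""
    dx, dy = to_digits(x), to_digits(y)
    size = 1
    while size < len(dx) + len(dy):
        size *= 2
    fx = fft([complex(d) for d in dx] + [0j] * (size - len(dx)))
    fy = fft([complex(d) for d in dy] + [0j] * (size - len(dy)))
    conv = fft([u * v for u, v in zip(fx, fy)], invert=True)
    result = 0
    for i, c in enumerate(conv):
        result += round(c.real / size) << (DIGIT_BITS * i)
    if result % CHECK_MOD != (x % CHECK_MOD) * (y % CHECK_MOD) % CHECK_MOD:
        result = x * y  # checksum failed: redo the product exactly
    return result
```

With 1661 digits of 20 bits, the convolution coefficients approach the 53-bit mantissa of a double, which is why the text stresses that exact results are not guaranteed.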

We used this fast multiplication both for primality tests and for the calcula- 
tion of modular square roots in the calculation of the run down sequence. While 
a modular square root can be checked, and the calculation repeated if necessary, 
there appears to be no way to detect a false negative result of a Miller-Rabin 
test. Therefore, by using this method we accepted a small but probably positive 
probability of a prime number being declared composite by mistake. 

The following benchmark results were obtained on an 800-MHz Athlon, using
version 4.1.1 of libgmp.a, version 3.0 of libfftw3.a, and $10^{9999} + 33603$ as the
input number:

— One call to the GMP function mpz_probab_prime_p with second argument 
equal to 1, which means that one Fermat and one Rabin-Miller pseudo- 
primality test is carried out: 317 seconds user time. 

— A Rabin-Miller test using 2 as base, and using Montgomery modular multi- 
plication [10] and the GMP functions: 149 seconds. 

— A similar program, but using libfftw3.a for multiplications: 56 seconds.

The advantage of using libfftw3.a could perhaps be reduced if GMP allowed
for a way to precalculate Fourier and Toom-Cook transforms of frequently used
factors. It is difficult to predict whether this would be sufficient to achieve the
speed of a multiplication subroutine which is optimized for speed at the expense
of sometimes (albeit rarely) producing a false result.

7 The Early Abort Strategy 

The idea behind this strategy is to force the new candidate N' in Step 2 to satisfy
$N/N' > 2^{\delta}$ for some (small) integer value $\delta = \delta(N)$, in the hope of decreasing
the number of steps and thereby the number of proving steps. Of course, this
might slow down the search for N' a little bit and some optimization is necessary.
Yet it appeared critical when used in the primality proof of $10^{9999} + 33603$, when
it was first implemented and used.

In FM's implementation, the following values were used for the lower bound $\delta(N)$
on b(N) − b(N'), where b(x) denotes the number of bits of the integer x:



b(N) <    1000   2500   5000   7500   10000   15000    ∞
δ(N)         0      5     10     15      20      25   30
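
The table translates directly into a lookup. The sketch below is an illustration with hypothetical function names; the treatment of a bit length exactly equal to a threshold (strict inequality b(N) < limit) is an assumption, since the table does not settle it.

```python
import bisect

# Thresholds on the bit length b(N) and the corresponding delta(N).
BIT_LIMITS = [1000, 2500, 5000, 7500, 10000, 15000]
DELTAS = [0, 5, 10, 15, 20, 25, 30]


def eas_delta(n):
    """Lower bound delta(N) on b(N) - b(N') taken from the table."""
    # Assumption: a column applies when b(N) is strictly below its limit.
    return DELTAS[bisect.bisect_right(BIT_LIMITS, n.bit_length())]


def accepts_candidate(n, n_new):
    """Early abort test: keep N' only if it is at least delta(N) bits shorter."""
    return n.bit_length() - n_new.bit_length() >= eas_delta(n)
```

For a 10000-digit number such as 10^9999 + 33603 the bit length is 33216, so the last column applies and a new candidate must be at least 30 bits shorter.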

The following data were gathered (with the mpi program to be described). For each
number, the first column contains the time without EAS, the second the time with
EAS. The line H_D stands for the time needed to compute the polynomials
representing the class field, as in [9]; CZ is the time needed for the special
variant of the Cantor-Zassenhaus algorithm using a trick of Atkin (using roots of
unity modulo N); the number of steps in the downrun is then given, and the last
lines contain the maximal value of h, followed by the mean value.



Number        … + 1887            … + 4771            … + 22669
              no EAS    EAS       no EAS    EAS       no EAS    EAS
Steps 1-2     81 h      58 h      280 h     164 h     716 h     476 h
Steps 3-5     26 h      26 h      76 h      86 h      209 h     261 h
H_D           1680 s    4880 s    4497 s    7317 s    3 h       5 h
CZ            22 h      22 h      63 h      75 h      179 h     234 h
# steps       436       358       597       437       734       512
max h         1968      2336      2184      2432      3400      4000
mean h        86        116       120       164       152       272


The restriction one puts on m and thus on D tends to make D and h larger
than in the plain case. This can have an impact on the time for computing H_D,
and also on the proving part. In the first phase, fewer N' are ever tested for
probable primality, though more must be produced. EAS indeed decreases the
number of steps, which tends to decrease the total time for the first phase, the
second being constant or increasing a little. In any case, a strategy yielding a
factor of 2 in the total running time is certainly worthwhile.



8 Some Large Primality Proofs 

8.1 The First Records of fastECPP 

We begin with some data from FM's implementation that uses MPI on top of his
ecpp program and implements Strategy 1. All computations were done on a
cluster of 6 biprocessor Xeons at 2.66 GHz. We took the following numbers from
the tables of P. Leyland*. These are numbers of the form $x^y + y^x$. WCT stands
for wall clock time and includes the time wasted by the distribution process
(waiting time of the slaves, typically). The line “Checking” indicates the time
needed to check the certificate. Note that the time for this should be 0((log N)^) 
and this is compatible with the timings given. 

All numbers were proven in 2003. The “when” line indicates the elapsed 
human dates in big endian notation. 

The first number was handled with an experimental program that turned out
to spend too much time in the $\sqrt{q^*}$ computations. As a matter of fact, a value
of r = 4000 was used. This led us to proceed by chunks of 400 square roots out of
a total of 4000, adding 400 more if this was not enough. All discriminants with
D < 10®, h < 6000 (later increased to 8000) and the largest prime factor of h 

* http://research.microsoft.com/~pleyland/primes/xyyx.htm

Table 1. Some large numbers proven with fastECPP.

x                  2177         2589          2551          2438          3571
y                  580          218           622           1995          648
#dd                6016         6055          7127          8046          10041
when               0513-0604    0606-0617     0618-0714     0715-0901     1001-1220
# steps            801          736           965           1128          947
Steps 1-2 (CPU)    164 days     103 days      235 days      355 days      531 days
Steps 1-2 (WCT)    164 × 1.2    103 × 1.1     235 × 1.3     355 × 1.13    531 × 1.2
                   81 days      30 days       74 days       138 days      204 days
Steps 3-5          28 days      21 days       55 days       77 days       138 days
H_D                2951 sec     1686 sec      18451 sec     22552 sec     20285 sec
CZ                 26 days      20 days       50 days       69 days       124 days
Checking           25 hours     22 hours      45 hours      70 hours      85 hours
max h              1980         2080          3312          3640          6176
mean h             121          103           190           209           409
max D              7,749,263    19,076,479    52,396,648    87,949,348    95,895,480


not exceeding 200 were decided to be usable. A look at column 3 compared to 4
justifies the claim of complexity of $O((\log N)^4)$. The 8046dd number was done
after the announcement of the proof of $10^{9999} + 33603$ (see next section), and EAS
was not used for this. The 10041dd number was finished on December 20, 2003,
well after the one to be described in the next section. This was the first use of
EAS for this implementation.
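
The comparison of columns 3 and 4 can be made explicit with a few lines of arithmetic. The sketch below (hypothetical names) compares the ratio predicted by a cost growing like the fourth power of the number of digits with the ratio of the CPU times from Table 1, taking "Steps 1-2 (CPU)" plus "Steps 3-5" as the total; which rows the authors had in mind is an assumption.

```python
def predicted_ratio(digits_a, digits_b, exponent=4):
    """Cost ratio predicted by a running time proportional to digits**exponent."""
    return (digits_b / digits_a) ** exponent


# CPU days from Table 1: Steps 1-2 (CPU) plus Steps 3-5.
CPU_DAYS = {7127: 235 + 55, 8046: 355 + 77}

predicted = predicted_ratio(7127, 8046)
observed = CPU_DAYS[8046] / CPU_DAYS[7127]
```

The predicted ratio is about 1.62 and the observed one about 1.49, which is the sense in which the timings are compatible with the claimed exponent.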

8.2 A New Frontier 

Let us turn our attention to the barrier-breaking number $10^{9999} + 33603$, whose
primality was verified by JF, TK with the help of TW.

The calculations were done using two programs, a pvm program producing
the sequence of discriminants, group orders and their prime number factorization
(called a run down sequence in what follows), and the second program calculating
the elliptic curves.

The calculation of a run down sequence was started on July 17, 2003 on
six 900MHz PIII CPUs. On July 21, the computation was moved to 12 nodes
of parnass2, the LINUX cluster built at the Scientific Computing Institute in
Bonn. 4 of these nodes had two 800MHz CPUs, the other nodes were double
PII/400MHz computers. At 8550 digits (on July 30) and 8286 digits (on July
31) we interrupted these calculations to replace the program by a faster version,
using the Fastest Fourier Transform in the West in the way explained above.
This improvement resulted in a speedup by a factor of about 2. On August 5,
we reached 6574 digits. On August 8, we stopped the program at 3256 digits.
The final calculations for the run down sequence took about 8 hours on eight
800MHz CPUs. The total CPU time to produce the run down (i.e., without
calculating the elliptic curves) was estimated at 234.5 days on a 1GHz Pentium.

The CPU time spent for the actual certificates is more difficult to estimate,
since the program for this step was still under development when the calculation
of the run down sequence started, and since these calculations were done in a
heterogeneous environment. We estimate that it would have taken less than 140
days on a single 800MHz Athlon CPU.

The certificate is available as:

ftp://ftp.math.uni-bonn.de:pub/people/franke/p10000.cert

9 Conclusions 

We have described some new ideas to speed up practical primality proving of 
large numbers using fastECPP. These ideas need more testing and improvements. 
We hope that this will serve as benchmarks and motivations for the study of other 
primality proving algorithms as well. 



Acknowledgments. Thanks to A. Enge for some useful discussions about this
work and for his careful reading of the manuscript. The authors of the certificate
for $10^{9999} + 33603$ would like to thank the Scientific Computing Institute at Bonn
University for providing the parallel computer which produced the downrun. One
of us (J. F.) would also like to thank G. Zumbusch for pointing out the existence
of libfftw3.a to him. Thanks also to D. Bernstein for sending us his remarks on
a preliminary version and to the referees that helped clarify some points.



References 

1. M. Agrawal, N. Kayal, and N. Saxena. PRIMES is in P. Preprint; available at
http://www.cse.iitk.ac.in/primality.pdf, August 2002.

2. A. O. L. Atkin and F. Morain. Elliptic curves and primality proving. Math. Comp.,
61(203):29-68, July 1993.

3. D. J. Bernstein. How to find small factors of integers. June 2002. Available at
http://cr.yp.to/papers.html.

4. A. Enge and F. Morain. Comparing invariants for class fields of imaginary
quadratic fields. In C. Fieker and D. R. Kohel, editors, Algorithmic Number
Theory, volume 2369 of Lecture Notes in Comput. Sci., pages 252-266.
Springer-Verlag, 2002. 5th International Symposium, ANTS-V, Sydney, Australia,
July 2002, Proceedings.

5. A. Enge and F. Morain. Fast decomposition of polynomials with known Galois
group. In M. Fossorier, T. Høholdt, and A. Poli, editors, Applied Algebra,
Algebraic Algorithms and Error-Correcting Codes, volume 2643 of Lecture Notes in
Comput. Sci., pages 254-264. Springer-Verlag, 2003. 15th International
Symposium, AAECC-15, Toulouse, France, May 2003, Proceedings.

6. A. Enge and R. Schertz. Constructing elliptic curves from modular curves of
positive genus. Submitted, 2001.

7. Matteo Frigo and Steven G. Johnson. The fastest Fourier transform in the
west. Technical Report MIT-LCS-TR-728, Massachusetts Institute of Technology,
September 1997.

8. Matteo Frigo and Steven G. Johnson. FFTW: An adaptive software architecture for
the FFT. In Proc. 1998 IEEE Intl. Conf. Acoustics Speech and Signal Processing,
volume 3, pages 1381-1384. IEEE, 1998.

9. G. Hanrot and F. Morain. Solvability by radicals from an algorithmic point of
view. In B. Mourrain, editor, Symbolic and algebraic computation, pages 175-182.
ACM, 2001. Proceedings ISSAC'2001, London, Ontario.

10. Peter L. Montgomery. Modular multiplication without trial division. Math. Comp.,
44:519-521, 1985.

11. F. Morain. Primality proving using elliptic curves: an update. In J. P. Buhler,
editor, Algorithmic Number Theory, volume 1423 of Lecture Notes in Comput. Sci.,
pages 111-127. Springer-Verlag, 1998. Third International Symposium, ANTS-III,
Portland, Oregon, June 1998, Proceedings.

12. F. Morain. Computing the cardinality of CM elliptic curves using torsion points.
Submitted, October 2002.

13. F. Morain. Implementing the asymptotically fast version of the elliptic curve
primality proving algorithm. June 2003. Available at
http://www.lix.polytechnique.fr/Labo/Francois.Morain/.

14. M. Ram Murty. Problems in Analytic Number Theory, volume 206 of Graduate
Texts in Mathematics. Springer-Verlag, 2001.




A Low-Memory Parallel Version of Matsuo, 
Chao, and Tsujii’s Algorithm 



Pierrick Gaudry^1 and Eric Schost^2

^1 Laboratoire LIX, Ecole polytechnique, 91128 Palaiseau, France
gaudry@lix.polytechnique.fr

^2 Laboratoire STIX, Ecole polytechnique, 91128 Palaiseau, France
Eric.Schost@polytechnique.fr



Abstract. We present an algorithm based on the birthday paradox,
which is a low-memory parallel counterpart to the algorithm of Matsuo,
Chao and Tsujii. This algorithm computes the group order of the Jacobian
of a genus 2 curve over a finite field for which the characteristic
polynomial of the Frobenius endomorphism is known modulo some integer.
The main tool is a 2-dimensional pseudo-random walk that allows us to
heuristically choose random elements in a 2-dimensional space. We analyze
the expected running time based on heuristics that we validate by
computer experiments. Compared with the original algorithm by Matsuo,
Chao and Tsujii, we lose a factor of about 3 in running time, but
the memory requirement drops from several GB to almost nothing. Our
method is general and can be applied in other contexts to transform a
baby-step giant-step approach into a low memory algorithm.



1 Introduction 

Jacobians of small genus curves have now become an important tool for public- 
key cryptography; computing the Zeta function of such curves remains one of 
the central problems in this area. 

For elliptic curves, the question found a first answer with Schoof’s algorithm 
and the subsequent improvements (see [4] and the references therein). Further- 
more, if the characteristic of the definition field is small, p-adic methods initiated 
with Satoh’s algorithm [18] give a tremendous speed-up. 

For higher genus curves, the p-adic methods (based on either Mestre’s [15] 
or Kedlaya’s [12] algorithms) also yield very satisfactory solutions in small char- 
acteristic, both in theory and in practice. However, in large characteristic, the 
question remains delicate. From the theoretical point of view, Pila’s algorithm 
[16] and subsequent improvements [2,10] give polynomial time solutions, follow- 
ing Schoof’s strategy. However, these have been turned into practical algorithms 
for genus 2 curves only [8], and it was only very recently that a Jacobian of 
cryptographic size was counted that way [9]. 

In this paper, we concentrate on the last part of a Schoof-like genus 2 algo- 
rithm. We first recall the basics of such algorithms. 

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 208-222, 2004. 

© Springer- Verlag Berlin Heidelberg 2004 

Let us denote by $\chi \in \mathbb{Z}[T]$ the characteristic polynomial of the Frobenius
endomorphism, so that $\chi(1)$ equals the Jacobian order. A basic task is to compute
the reduction of $\chi$ modulo $\ell$ for sufficiently many small primes or prime powers
$\ell$. Such information is obtained through the study of the torsion in the Jacobian;
if the characteristic is medium, the Cartier-Manin operator [8,6] can also be used.

Thanks to Weil's bounds, one can then collect modular information until $\chi$
is known. However, it is far better to switch to a baby-step giant-step (BSGS)
algorithm before the end, since collecting modular information is costly. For
instance, in [9], the highest prime $\ell$ was 19, whereas without the BSGS phase,
it would have been 59.

If $\chi(1)$ is known modulo m, then using standard BSGS techniques, the time
necessary to compute the Jacobian order varies like $1/\sqrt{m}$. In 2002, Matsuo,
Chao and Tsujii [14] showed that if we use not only the value $\chi(1)$ modulo m,
but rather all coefficients of $\chi$ modulo m, it is possible to speed up the BSGS
computation. The runtime of their method (called MCT in the following) varies
like 1/m, which is an important improvement. The main drawback is the space
complexity: the largest example shown in [14] used 12 GB of central memory,
whereas the runtime (5 days on a single processor) was quite reasonable.

Standard BSGS techniques have low-memory, parallelizable, probabilistic
counterparts, based on the rho or lambda (kangaroo) methods of Pollard [17]:
such techniques were presented in [24,22,23,8,21]. In this paper, we apply the
ideas of Matsuo et al. to improve the variant of [8]; we obtain a probabilistic
algorithm, with a heuristic complexity analysis, but which requires almost no
memory and is immediately parallelizable. The expected running time is about
3 times the running time of the MCT algorithm.

In order to simplify the exposition, we concentrate on genus 2 curves. However,
our idea works for more general curves as soon as the characteristic polynomial
$\chi$ is known modulo some integer m. Another application of our method
is the algorithm of [3] for Picard curves, which is a BSGS type algorithm.

The paper is organized as follows. Section 2 defines the necessary notation
and recalls the BSGS algorithm of [14]. Then, to introduce methods based on 
the birthday paradox, we start in Section 3 by a special case, where the modular 
information is complete enough, so we can use the method of [8]. In Section 4, 
we deal with the general case, and introduce bidimensional analogues of these 
techniques. Section 5 finally presents our experimental results. 



2 MCT Algorithm for Genus 2 Hyperelliptic Curves 

2.1 Characteristic Polynomial of Frobenius Endomorphism

Let C be a genus 2 curve, defined over the finite field $\mathbb{F}_q$ with q elements. The
Jacobian group of C is denoted by J(C) and the characteristic polynomial of the
Frobenius endomorphism is denoted by $\chi(T)$. This polynomial has the form

\[ \chi(T) = T^4 - s_1 T^3 + s_2 T^2 - q s_1 T + q^2, \]

where $s_1$ and $s_2$ are integers. The group order of the Jacobian is then given by

\[ \#J(C) = \chi(1) = q^2 + 1 - s_1(q + 1) + s_2. \]

By Weil's theorem, the roots of $\chi$ have absolute value $\sqrt{q}$, and bounds on $s_1$ and
$s_2$ follow directly. Better bounds can be found using the fact that the roots of $\chi$
come in pairs of complex conjugates [13, Proposition 7.1]:

\[ |s_1| \le 4\sqrt{q}, \qquad 2|s_1|\sqrt{q} - 2q \le s_2 \le \frac{s_1^2}{4} + 2q. \qquad (1) \]

The values of $(s_1, s_2)$ satisfying these bounds form the hatched zone in Figure 1.
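
The bounds (1) can be tested in exact integer arithmetic by squaring away the square roots. The sketch below (hypothetical function names) does this, and as a sanity check generates the pairs $(s_1, s_2)$ that arise when $\chi$ splits as $(T^2 - aT + q)(T^2 - bT + q)$ with $|a|, |b| \le 2\sqrt{q}$; the split case is used here only as an easily generated family of valid examples.

```python
from math import isqrt


def in_weil_region(q, s1, s2):
    """Exact test of |s1| <= 4 sqrt(q), 2|s1| sqrt(q) - 2q <= s2 <= s1^2/4 + 2q."""
    return (s1 * s1 <= 16 * q
            and s2 + 2 * q >= 0
            and 4 * q * s1 * s1 <= (s2 + 2 * q) ** 2
            and 4 * s2 <= s1 * s1 + 8 * q)


def jacobian_order(q, s1, s2):
    """chi(1) = q^2 + 1 - s1 (q + 1) + s2."""
    return q * q + 1 - s1 * (q + 1) + s2


def split_pairs(q):
    """(s1, s2) for chi = (T^2 - aT + q)(T^2 - bT + q), |a|, |b| <= 2 sqrt(q)."""
    bound = isqrt(4 * q)
    for a in range(-bound, bound + 1):
        for b in range(a, bound + 1):
            yield a + b, 2 * q + a * b
```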




Fig. 1. Bounds on $s_1$ and $s_2$

2.2 Review of the MCT Algorithm 

In large characteristic, point counting algorithms work by collecting modular 
information; this information is then recombined using the Chinese Remainder 
Theorem and the work is finished using some BSGS strategy. We now describe 
the MCT algorithm for this last step. 

From now on, we assume that the characteristic polynomial $\chi$ is known modulo
some positive integer m, i.e. $s_1$ and $s_2$ are known modulo m. Therefore we
introduce new variables for the known parts $\tilde s_i$ and the unknown parts $\hat s_i$ of
$s_1$ and $s_2$:

\[ s_1 = \tilde s_1 + m \hat s_1 \quad \text{and} \quad s_2 = \tilde s_2 + m \hat s_2, \]

so that our goal is now to find $\hat s_1$ and $\hat s_2$. To this effect, a random divisor D is
first picked in J(C): the strategy is to compute the order of D, hoping that it is
large enough to be able to conclude (the case when J(C) is highly non-cyclic is
rare in practice and easily tackled).

The order of D divides the group order $\chi(1)$, therefore we have the equality

\[ (q^2 + 1 - \tilde s_1 (q + 1) + \tilde s_2) \cdot D + (-\hat s_1 (q + 1) + \hat s_2) \cdot m \cdot D = 0. \]
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To obtain a BSGS algorithm, it is usual to separate the two unknowns, one on 
each side of the equation. Here the unknowns ^l,S 2 lie in intervals that are of 
different sizes, therefore it is necessary to split again S 2 into two pieces. Let n 
be a parameter to be fixed later, and let us write 52 = ^2 + nu 2 with 0 < t 2 < n. 
Then x(l) • = 0 rewrites 

+ 1 — sl(g + 1) + ^ + m{—s{{q + 1) + nu 2 )) ■ D = —t 2 m ■ D. 

The algorithm proceeds as follows: first, all possible values for the right-hand
side are computed and stored in a data structure in which searching is fast. Then
the left-hand side is computed for all possible values of $\tilde{s}_1$ and $u_2$, until a match
is found with a value of the right-hand side. For each value of $\tilde{s}_1$, the bounds of
Equation (1) are used to find the range of the possible values for $u_2$; a precise
study of the area of the search space leads to an optimal value of $n$ of order $q^{3/4}/m$,
yielding a running time of $O(q^{3/4}/m)$ operations in $J(C)$.

3 Preliminaries: The Case $m > 8\sqrt{q}$

The main drawback of the previous method is the storage requirement. We now 
introduce low-memory variants and start by describing a special case, for which 
most features of the general treatment are already present. 

Suppose that $\chi$ is known modulo $m$, with $m > 8\sqrt{q}$. Because $|s_1| \le 4\sqrt{q} < m/2$,
the coefficient $s_1$ is known exactly. Corresponding to this value of $s_1$,
we have bounds for $s_2$ which yield bounds on $\tilde{s}_2$, which is the only remaining
unknown. We are thus in the setting of the search of a group order in a bounded
arithmetic progression. Among the many variants inspired by Pollard's lambda
(or kangaroo) method that yield a low-memory solution to this question, we
chose the one based on the birthday paradox described in [8]. We recall it here
briefly; detailed descriptions and analyses of several other variants can be found
in [17,24,22,23,21].



3.1 Random Search in Two Intersecting Intervals 

Let $K = q^2 + 1 - s_1(q+1) + \bar{s}_2$. Then the group order can be written
$\#J(C) = K + m\tilde{s}_2$, where $K$ is known and $\tilde{s}_2$ is unknown. Performing a suitable shift, it
is possible to adjust $K$ by some multiple of $m$ so that the bounds on $\tilde{s}_2$ become
symmetric, i.e. such that $|\tilde{s}_2| \le B$ for some integer $B$.

As before, we pick at random an element $D$ in $J(C)$, of presumably large
order. We then define two sets of divisors that we denote $W$ and $T$:

\[ W = \{(K + m\sigma_2)\cdot D;\ \sigma_2 \in [-B, B]\}, \qquad T = \{m\sigma_2\cdot D;\ \sigma_2 \in [-B, B]\}. \]

By definition $W$ and $T$ intersect, since the zero divisor is in both of them, by
taking $\sigma_2 = \tilde{s}_2$ for $W$ and $\sigma_2 = 0$ for $T$. More precisely, the size of the intersection
is $\#(W \cap T) \in [B+1, 2B+1]$, depending on how far $\tilde{s}_2$ is from 0. The algorithm
then proceeds as follows: we pick random elements uniformly, alternating between $W$
and $T$. If the same divisor is obtained as an element of both $W$ and $T$ then (a
multiple of) the order of $D$ can be deduced.

Assuming that picking a random element in $W$ or $T$ has unit cost, that
all elements are stored in a table, and that a search in the table has unit cost,
then by the birthday paradox, a collision can be obtained in expected time and
space $O(\sqrt{B})$. The constant in the $O(\,)$ can actually be made more explicit.

First, for fixed $\tilde{s}_2$, the expected running time grows like $\sqrt{(2B+1)\pi/\gamma}$, where
$\gamma = \#(W \cap T)/(2B+1) = (2B+1-|\tilde{s}_2|)/(2B+1)$. Thus, assuming that
$\tilde{s}_2$ is uniformly distributed in $[-B, B]$, the expected number of operations is
asymptotic to

\[ \frac{1}{2B+1}\int_{-B}^{B} \sqrt{\frac{(2B+1)\pi}{\gamma(\tilde{s}_2)}}\,d\tilde{s}_2 = \int_{-B}^{B} \sqrt{\frac{\pi}{2B+1-|\tilde{s}_2|}}\,d\tilde{s}_2, \]

which itself is asymptotic to $2(2-\sqrt{2})\sqrt{2\pi B} \approx 2.07\sqrt{2B}$. One should
note, however, that even if $C$ is chosen uniformly at random among all curves, the
number of points of the Jacobian is not uniformly distributed, as the concentration
is slightly higher in the middle of the Hasse-Weil interval; therefore $\tilde{s}_2$ is not
uniformly distributed.
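The constant 2.07 can be checked numerically. The sketch below assumes that the cost for a fixed value of the unknown is sqrt((2B+1)*pi/gamma), with gamma = (2B+1-|s|)/(2B+1), and replaces the integral by a plain average over the integers of [-B, B]; the function name and the value of B are illustrative.

```python
from math import pi, sqrt


def average_cost(B):
    """Mean over s in [-B, B] of sqrt((2B+1)*pi/gamma(s))."""
    n = 2 * B + 1
    total = sum(sqrt(n * pi / ((n - abs(s)) / n)) for s in range(-B, B + 1))
    return total / n


B = 100_000
ratio = average_cost(B) / sqrt(2 * B)      # measured constant in front of sqrt(2B)
limit = 2 * (2 - sqrt(2)) * sqrt(pi)       # closed form, about 2.07
```

For B of this size, `ratio` and `limit` agree to several decimal places.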



3.2 Pseudo-Random Walk and Distinguished Points

We now address the main issues raised in the above algorithm: the generation 
of random elements in W and T and their storage. The key to the answer is to 
replace randomness by a deterministic pseudo-random walk. 

Let $r > 0$ and $\ell$ be parameters to be fixed later. The pseudo-random walk
is initialized as follows: for all $k$ in $[1, r]$, an offset $O_k \in J(C)$ is precomputed
as $O_k = \alpha_k m\cdot D$, where $\alpha_k$ is a random integer in $[0, 2\ell]$. We also need a hash
function $\mathcal{H}$ that maps elements of $J(C)$ to $[1, r]$; this hash function should have
good statistical properties, but no cryptographically strong property, like
one-wayness, is required. Typically, $\mathcal{H}$ is obtained by taking a few bits in the internal
representation of the elements and the integer obtained this way is taken modulo
$r$. Then, starting with an element $P$ in $W$ (resp. in $T$) for which we know the
corresponding $\sigma_2$, we define another point $Q$ by $Q = P + O_{\mathcal{H}(P)}$.

Assuming that $\ell$ is not too large compared to $B$, with high probability the
point $Q$ is still an element of $W$ (resp. of $T$). Furthermore, the value of $\sigma_2$
corresponding to $Q$ is obtained by adding $\alpha_{\mathcal{H}(P)}$ to the value of $\sigma_2$ for $P$.

Iterating this process yields a chain of pseudo-randomly chosen elements in
$W$ (resp. in $T$). However, the chain should not be too long, to keep the probability
of leaving the domain moderate. Thus the average length $\ell$ of an offset must
be adapted according to the average number of steps we expect to do in one
chain (see below). With this device it is then possible to produce each new
pseudo-random element in $W$ or $T$ for one operation in $J(C)$.

We now deal with storage requirements. To this effect, we introduce the
concept of distinguished points, which originally appeared in [7]. We say that an element
of $J(C)$ is a distinguished point if its image by a second hash function is 0. Again,
we do not ask much of this hash function, except that its behavior looks
independent of any arithmetic property of the element, and that the probability of
being distinguished can be effectively estimated and tuned to a prescribed value.
As usual, looking at some bits in the internal representation of the elements is
a good way of doing so. We denote by $p_{\mathcal{D}}$ the probability for an element of being
distinguished.

The algorithm then proceeds as follows: starting with a random point
alternatively in $W$ and $T$, we produce pseudo-random elements using the
pseudo-random walk, until a distinguished point is hit. Then this point is stored and
another chain is started. The length of the chains is $1/p_{\mathcal{D}}$ on average, and the
parameter $\ell$ should be tuned accordingly. If all the parameters are well chosen,
then after (say) 1000 chains on average, we have produced enough points to
expect a collision. If this occurs at a point that is not distinguished, then the two
chains continue on the same track because the pseudo-random walk is
deterministic; the two chains will end at the same distinguished point, thus allowing the
detection and the solution of the problem.
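The walk with distinguished points can be tried out on a toy model. The sketch below is not the authors' implementation: it replaces J(C) by the cyclic group Z/NZ with D = 1, so that a group element is an integer modulo N, and the secret order N is used only to reduce elements (it plays the role of the group law). The hash `mix`, the parameter defaults, the cap on the chain length and the use of nonzero offsets (to avoid fixed points of the walk) are all assumptions made for this illustration.

```python
import random

MASK64 = (1 << 64) - 1


def mix(x):
    """Cheap non-cryptographic hash of a group element (here an integer)."""
    return ((x * 0x9E3779B97F4A7C15) & MASK64) >> 24


def recover_order(K, m, B, group_order, r=20, dp_bits=4, seed=1, max_chains=100_000):
    """Find K + m*s2 with |s2| <= B that kills D = 1 in Z/group_order Z."""
    rng = random.Random(seed)
    chain_len = 1 << dp_bits                       # average chain length 1/p_D
    ell = max(1, B // (10 * chain_len))            # a chain drifts by about B/10
    alphas = [rng.randint(1, 2 * ell) for _ in range(r)]
    offsets = [(a * m) % group_order for a in alphas]   # O_k = alpha_k * m * D
    stored = {}                                    # distinguished point -> (set, sigma_2)
    for chain in range(max_chains):
        label = "WT"[chain % 2]                    # alternate between W and T
        sigma = rng.randint(-B, B)
        P = ((K if label == "W" else 0) + m * sigma) % group_order
        for _ in range(20 * chain_len):            # give up on overly long chains
            h = mix(P)
            if h % chain_len == 0:                 # P is a distinguished point
                break
            k = (h >> dp_bits) % r
            P = (P + offsets[k]) % group_order     # Q = P + O_{H(P)}
            sigma += alphas[k]
        else:
            continue                               # no distinguished point reached
        if P in stored and stored[P][0] != label:
            other = stored[P][1]
            s_w, s_t = (sigma, other) if label == "W" else (other, sigma)
            return K + m * (s_w - s_t)             # (K + m*s_w)*D == m*s_t*D
        stored[P] = (label, sigma)
    raise RuntimeError("no collision found")
```

Only the distinguished points are stored, so the memory usage is governed by the number of chains rather than by the number of steps.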

Many practical experiments were made to test this strategy against the
idealized algorithm described above. They are very satisfactory and in [22] it was
suggested that taking $r = 20$ is enough to simulate a random walk in this context.

As a conclusion, in the case when $m > 8\sqrt{q}$, the number of points can
be computed using a parallel low-memory algorithm that requires on average
$O(\sqrt{q/m})$ operations in $J(C)$. This is worse than the $O(q^{3/4}/m)$ complexity
announced in [14], but their complexity analysis implicitly excludes that case.



4 The General Case 

In the case when $m < 8\sqrt{q}$, there are several choices for $\tilde{s}_1$ and still many
more for $\tilde{s}_2$. We could loop over all the possible values of $\tilde{s}_1$ and look for a
corresponding $\tilde{s}_2$ using the algorithm of the previous section, but this approach
has worse complexity than the MCT algorithm. The workaround is to take into
account the bidimensional nature of the problem.

Recall that we have $s_1 = \bar{s}_1 + m\tilde{s}_1$ and $s_2 = \bar{s}_2 + m\tilde{s}_2$, where $\bar{s}_1$ and $\bar{s}_2$
are known integers in $[0, m-1]$, the goal being to find $\tilde{s}_1$ and $\tilde{s}_2$. From the
bounds (1) on $s_1$ and $s_2$, we deduce similar bounds for $\tilde{s}_1$ and $\tilde{s}_2$. In order to
make the description more generic, we write these bounds in the following form:

\[ B_{1,\min} \le \tilde{s}_1 \le B_{1,\max}, \qquad B_{2,\min} \le \tilde{s}_2 \le B_{2,\max}. \]

4.1 Random Search in Two Intersecting Rectangles 

Let $D$ be a random element of $J(C)$; as previously, we assume that the order of
$D$ is large enough compared to the group order. Since $\chi(1)\cdot D = 0$, we have

\[ (q^2 + 1 - \bar{s}_1(q+1) + \bar{s}_2)\cdot D + (-\tilde{s}_1(q+1) + \tilde{s}_2)\cdot m\cdot D = 0. \]

Let $K$ be $q^2 + 1 - \bar{s}_1(q+1) + \bar{s}_2$, which is a known integer; then we have to find
$\tilde{s}_1$ and $\tilde{s}_2$ such that $K\cdot D + (-\tilde{s}_1(q+1) + \tilde{s}_2)\,m\cdot D = 0$. Let $R$ be a rectangle
containing the possible values for the pair $(\tilde{s}_1, \tilde{s}_2)$:

\[ R = \{(\sigma_1, \sigma_2);\ \sigma_1 \in [B_{1,\min}, B_{1,\max}],\ \sigma_2 \in [B_{2,\min}, B_{2,\max}]\}. \]

Since $B_{1,\min}$ (resp. $B_{2,\min}$) is not necessarily the opposite of $B_{1,\max}$ (resp.
$B_{2,\max}$), we have to normalize the situation. We define a value $K'$ to be used in
place of $K$, so as to center our search:

\[ K' = K + m\left(-\left\lfloor \frac{B_{1,\min} + B_{1,\max}}{2} \right\rfloor (q+1) + \left\lfloor \frac{B_{2,\min} + B_{2,\max}}{2} \right\rfloor\right). \]

We can now define two sets of points:

\[ W = \{K'\cdot D + (-\sigma_1(q+1) + \sigma_2)\cdot m\cdot D;\ (\sigma_1, \sigma_2) \in R\}, \]

\[ T = \{(-\sigma_1(q+1) + \sigma_2)\cdot m\cdot D;\ (\sigma_1, \sigma_2) \in R\}. \]

We assume that these two sets have cardinality exactly $\#R$. This may not hold
in general, but is true with the further assumptions that $D$ is of large order and
that $m$ is larger than 8.

By construction, the sets $W$ and $T$ have a non-trivial intersection. Let $D_W$
be in $W$ and $D_T$ in $T$. We write $(\sigma_{1W}, \sigma_{2W})$ for the values corresponding to $D_W$
and $(\sigma_{1T}, \sigma_{2T})$ for the values corresponding to $D_T$. Then, assuming again that the
order of $D$ is large enough, $D_W = D_T$ if and only if

\[ \sigma_{1W} - \sigma_{1T} = \tilde{s}_1 - \lfloor (B_{1,\min} + B_{1,\max})/2 \rfloor, \qquad \sigma_{2W} - \sigma_{2T} = \tilde{s}_2 - \lfloor (B_{2,\min} + B_{2,\max})/2 \rfloor. \tag{2} \]

Hence it is easily checked that

\[ N = \#(W \cap T) \in \left[\tfrac{1}{4}\#R,\ \#R\right]. \]

When $B_{1,\min} = -B_{1,\max}$ and $B_{2,\min} = -B_{2,\max}$, we get the picture of Figure 2.

A first version of our algorithm now proceeds as follows: random elements
of $W$ and $T$ are constructed by picking random elements in $R$. These elements
are stored in a data structure in which collisions between an element of $W$ and
an element of $T$ can be detected quickly. Together with these elements,
we also store the corresponding pair $(\sigma_1, \sigma_2)$. On average, due to the birthday
paradox, a collision occurs after having constructed $O(\sqrt{N})$ elements of $W$ and
$T$; taking the difference of the pairs $\sigma_T$ and $\sigma_W$ then gives the result by
Equations (2).

Since the bounds on $s_1$ and $s_2$ yield $|\tilde{s}_1| = O(\sqrt{q}/m)$ and $|\tilde{s}_2| = O(q/m)$, we
have $N = O(q^{3/2}/m^2)$, and therefore the expected number of points to construct
is in $O(q^{3/4}/m)$. Just as in the unidimensional case, we now give an estimate of
the constant hidden in the $O(\,)$, using simplifying assumptions.




A Low-Memory Parallel Version of Matsuo, Chao, and Tsujii's Algorithm

Fig. 2. Intersection of $W$ and $T$



For such estimates, we can assume that the problem is centered, and therefore
we put $B_1 = B_{1,\max} = -B_{1,\min}$ and $B_2 = B_{2,\max} = -B_{2,\min}$; hence
$\#R = (2B_1 + 1)(2B_2 + 1)$. We denote by $\gamma \in [\frac{1}{4}, 1]$ the ratio $\#(W \cap T)/\#R$; this
parameter is easily computed as:

\[ \gamma(\tilde{s}_1, \tilde{s}_2) = \frac{(2B_1 + 1 - |\tilde{s}_1|)(2B_2 + 1 - |\tilde{s}_2|)}{\#R}. \]

From the birthday paradox, we see that the expected number of elements that
have to be created before a collision between an element of $W$ and an element
of $T$ occurs is asymptotic to $\sqrt{\pi\,\#R/\gamma}$.

We now assume that $\tilde{s}_1$ and $\tilde{s}_2$ are uniformly distributed. Then the average
number of points to construct grows like

\[ \frac{1}{\#R}\int_{\tilde{s}_1=-B_1}^{B_1}\int_{\tilde{s}_2=-B_2}^{B_2} \sqrt{\frac{\pi\,\#R}{\gamma(\tilde{s}_1, \tilde{s}_2)}}\;d\tilde{s}_1\,d\tilde{s}_2. \]

This integral is easily computed and shows that the expected number of points
to construct is asymptotic to $8\sqrt{\pi}(3 - 2\sqrt{2})\sqrt{\#R} \approx 2.43\sqrt{\#R}$. Now, bounds (1)
on $s_1$ and $s_2$ yield approximate bounds for $\tilde{s}_1$ and $\tilde{s}_2$:

\[ B_{1,\max} - B_{1,\min} \approx 8\sqrt{q}/m, \qquad B_{2,\max} - B_{2,\min} \approx 8q/m, \]

hence $\#R \approx 64\,q^{3/2}/m^2$. The approximate value of 2.43 then yields a running
time of about $19.5\,q^{3/4}/m$ operations in $J(C)$. Hence, we see that a constant
factor of about 5 is lost compared to the original (memory consuming) MCT
algorithm. This difference is partly due to the fact that in the MCT algorithm,
the BSGS approach allows searching only in the area described in Figure 1,
whereas ours does not take this specificity into account.
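As a check on the constants, the double integral can be evaluated one variable at a time. The sketch below neglects the $+1$ terms, i.e. it takes $\gamma \approx (1 - |\tilde{s}_1|/(2B_1))(1 - |\tilde{s}_2|/(2B_2))$ and $\#R \approx 4B_1B_2$, and it assumes that the cost for a fixed pair $(\tilde{s}_1, \tilde{s}_2)$ is $\sqrt{\pi\,\#R/\gamma}$. For one variable, with $s$ uniform in $[-B, B]$,

\[ \frac{1}{2B}\int_{-B}^{B} \frac{ds}{\sqrt{1 - |s|/(2B)}} = \frac{1}{B}\Bigl[-4B\sqrt{1 - \tfrac{s}{2B}}\Bigr]_{0}^{B} = 4\Bigl(1 - \tfrac{1}{\sqrt{2}}\Bigr) = 2(2 - \sqrt{2}). \]

Since $\gamma$ is a product of two such factors, the average over the rectangle is the square of this value:

\[ \frac{1}{\#R}\iint \sqrt{\frac{\pi\,\#R}{\gamma}} \approx \sqrt{\pi\,\#R}\,\bigl(2(2 - \sqrt{2})\bigr)^2 = 8\sqrt{\pi}\,(3 - 2\sqrt{2})\,\sqrt{\#R} \approx 2.43\,\sqrt{\#R}. \]

With $\#R \approx 64\,q^{3/2}/m^2$ this gives $2.43 \times 8\,q^{3/4}/m \approx 19.5\,q^{3/4}/m$.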

Note that this analysis is idealized in several places: first, the pseudo-random
walk that will be used below is not purely random and can cause some
discrepancies. Next, the assumption that $\tilde{s}_1$ and $\tilde{s}_2$ are uniformly distributed in a rectangle
is actually wrong. However, we expect that this should still give a good estimate.

4.2 A Bidimensional Pseudo-Random Walk

The main questions are now the generation of random elements in W and T, 
and how to avoid storing them. These are the same issues as in Section 3 and 
similar techniques will be applied to solve them. The definition of distinguished 
points still makes sense in this new context, and will be used with no modifica- 
tion. However, the previous pseudo-random walk needs to be converted into a 
bidimensional one. 

Let $r$, $\ell_1$ and $\ell_2$ be parameters to be fixed later: $r$ controls the number of
offsets that we are going to precompute, and $\ell_1$ and $\ell_2$ are the average lengths of
the horizontal and vertical offsets. For each $k$ and $k'$ in $[1, r]$, we select a random
non-negative integer $\alpha_{k,k'}$ uniformly in $[0, 2\ell_1]$ and a random non-negative
integer $\beta_{k,k'}$ uniformly in $[0, 2\ell_2]$. Then for each $k$ and $k'$ in $[1, r]$ and $b$ in $\{0, 1\}$, we
compute and store the offsets $O_{k,k',b} = (-1)^b\alpha_{k,k'}(q+1)mD + \beta_{k,k'}mD \in J(C)$,
where $D$ is the base point whose order is to be computed.

Starting with a point $P$ in $W$ (resp. in $T$) for which we know the
corresponding pair $(\sigma_1, \sigma_2)$, we define another point as follows. We compute $k$, $k'$,
and $b$ as pseudo-random deterministic functions of $P$, by using some hash
functions. Then we define $Q = P + O_{k,k',b}$. If $\ell_1$ and $\ell_2$ are small enough, with high
probability $Q$ is still in $W$ (resp. in $T$) and the corresponding pair is given by
$(\sigma_1 - (-1)^b\alpha_{k,k'},\ \sigma_2 + \beta_{k,k'})$. Iterating this process allows us to produce chains of
pseudo-randomly chosen elements in $W$ (resp. in $T$). As before, the chains
cannot be too long, otherwise they go out of $W$ (resp. of $T$). The cost of producing
one element is one group operation.

Note that we have only used positive values for the offset in the second 
direction: this is intended to reduce the chance of creating cycles. However, 
experiments with alternative strategies turned out to yield similar results. 

4.3 Setting the Parameters 

Let $\lambda$ be such that the expected number of points to construct is $\lambda\sqrt{\#R}$ and
let $C$ be the number of chains we expect to construct. Note that $C$ is fixed by
the user; it should be large enough so that averaging considerations make sense
(say $C > 1000$), and small enough so that the cost of initializing a chain is
negligible compared to the cost of the steps that are done in that chain. Also
the number of chains is essentially the number of distinguished points that have
to be stored and therefore should be small enough. Knowing an approximation
of $\lambda$ and having fixed $C$, the probability of being distinguished follows from
$p_{\mathcal{D}} = C/(\lambda\sqrt{\#R})$.

Now we fix $\ell_1$ and $\ell_2$; to control the probability of going out of $W$ (or $T$),
we first evaluate the average length of a chain. There are about $1/p_{\mathcal{D}}$ steps and
each step goes on average $\ell_1$ in the first direction and $\ell_2$ in the second direction.
In the second direction, all the offsets are positive and therefore the length of
a chain is $\ell_2/p_{\mathcal{D}}$ on average. We want this to be small enough compared to the
size of $R$ in that direction, say one tenth of $B_{2,\max} - B_{2,\min}$:

\[ \ell_2 = \frac{(B_{2,\max} - B_{2,\min})\,p_{\mathcal{D}}}{10}. \]

In the first direction, the situation is different since the offsets can be positive or
negative, but still with an average absolute value of $\ell_1$. The central limit
theorem applied to the 1-dimensional random walks gives that the average distance
to the origin after $1/p_{\mathcal{D}}$ steps is about $2\sqrt{2/(3\pi)}\;\ell_1/\sqrt{p_{\mathcal{D}}}$; the factor $2/\sqrt{3}$
corresponds to the standard deviation of the lengths of the offsets. For convenience,
we approximate $2\sqrt{2/(3\pi)}$ by 9/10. Again, imposing that this value is one tenth
of the size of $R$ in that direction yields:

\[ \ell_1 = \frac{(B_{1,\max} - B_{1,\min})\,\sqrt{p_{\mathcal{D}}}}{9}. \]
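The parameter choices of this section amount to a few lines of arithmetic. The sketch below assumes that the probability of being distinguished is chosen as C/(lambda*sqrt(#R)), i.e. that C chains of average length 1/p_D together produce lambda*sqrt(#R) points, and it approximates #R by the product of the two side lengths; the function and argument names are illustrative.

```python
from math import sqrt


def walk_parameters(width1, width2, chains, lam=2.43):
    """Return (p_D, ell_1, ell_2) for a rectangle of the given side lengths.

    width1 = B_{1,max} - B_{1,min}, width2 = B_{2,max} - B_{2,min};
    `chains` is the user-chosen number C of chains; `lam` approximates lambda.
    """
    size_R = width1 * width2               # approximate #R
    p_d = chains / (lam * sqrt(size_R))    # C chains of mean length 1/p_D
    ell2 = width2 * p_d / 10               # drift ell2/p_D is a tenth of the height
    ell1 = width1 * sqrt(p_d) / 9          # (9/10)*ell1/sqrt(p_D) is a tenth of the width
    return p_d, ell1, ell2
```

When the two side lengths differ by a large factor, the returned `ell1` can drop below 1, which is the situation discussed next.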

It may happen that $\ell_1$ or $\ell_2$ is very small, and even smaller than 1. This is
especially the case when the bounds on $\tilde{s}_1$ and $\tilde{s}_2$ have a different order of
magnitude. Unfortunately this is the case in the genus 2 point counting case,
where $\tilde{s}_2$ is on average about $\sqrt{q}$ times larger than $\tilde{s}_1$. A solution would be to
enlarge $C$, but this is not satisfactory since it implies more storage.

A better choice is to modify the random walk as follows. Assume that $\ell_1$ is
small ($\ell_1$ and $\ell_2$ cannot be simultaneously small). Then with probability $p$, we
add either the same offset $O_{k,k',b}$ as before or a modified offset $O_{k,k'} = \beta_{k,k'}mD$
which does not include a progression in the first direction. The probability $p$ is
fixed so that the apparent mean value of $\ell_1$ is the one we wanted. Note that the
decision of adding $O_{k,k',b}$ or $O_{k,k'}$ is a deterministic choice that depends on $D$.



4.4 Reducing the Search Space 

In order to have a better understanding of the distribution of $(s_1, s_2)$, we ran
some statistics. Note that similar statistics, supported by theoretical and heuristic
considerations, were done in [20], but with the purpose of finding the mean
value of the class number.

Here, for p = 10® + 3, we randomly selected 10,000 monic squarefree polynomials
of degree 5 over $\mathbb{F}_p$ and computed the $(s_1, s_2)$ values for the corresponding
curves. As expected, the pairs $(s_1, s_2)$ tend to be not too close to the borders of
the domain. In Figure 3, we represented the domain where $(s_1, s_2)$ are allowed
to stay according to bounds (1), and inside the domain, the darkness of a point
means that the density of pairs $(s_1, s_2)$ is high.

On the picture, we see that there are very few pairs for which $s_2$ is large,
because the "wings" are very thin. In fact, these points correspond to curves
which are close to maximal curves, and it is no surprise that they are rare. More
precisely, in our tests, the proportion of curves for which $s_2$ is larger than $3q$ is
about 2.2% and the proportion of curves for which $s_2$ is larger than $4q$ is about
0.23% (remember that in theory, $s_2$ can be as large as $6q$).

Fig. 3. Statistics on $(s_1, s_2)$



In view of these results, in order to improve our constant 19.5, and remarking
that there is no point in spending too much time for large values of $s_2$, we will
restrict the rectangle $R$ to the following bounds:

\[ R = \{(\sigma_1, \sigma_2);\ \sigma_1 \in [-2.5\sqrt{q}/m,\ 2.5\sqrt{q}/m],\ \sigma_2 \in [-2q/m,\ 3q/m]\}. \]

In our statistics, for more than 97% of the curves, $\tilde{s}_2 < 3q/m$, and therefore
the overlapping factor of $W$ and $T$ is at least 1/4; and for about 99.7% of the
curves, $\tilde{s}_2 < 4q/m$ and therefore the overlapping factor is at least 0.12. The
overlapping decreases as the point $(s_1, s_2)$ gets closer to the end of the "wings"
of the arrow on the picture and if $s_2 > 5.5q$, the sets $W$ and $T$ do not overlap.

With this strategy, the area of $R$ is reduced to $25\,q^{3/2}/m^2$ so that the
expected runtime is about $12\,q^{3/4}/m$ group operations. Therefore we lose "only"
a factor of 3 compared to the original MCT algorithm; this strategy is used in
the experiments presented below.
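The overlapping factor for the reduced rectangle can be estimated with a continuous approximation. The sketch below works in the normalized coordinates x1 = s1/sqrt(q) and x2 = s2/q, in which the rectangle is [-2.5, 2.5] x [-2, 3]; it ignores the rounding to integers and the residues modulo m, and the function name is illustrative.

```python
def overlap_factor(x1, x2):
    """Approximate #(W ∩ T)/#R for the reduced rectangle.

    The rectangle has centre (0, 0.5) and both sides have length 5, so the
    sets W and T are shifted by (x1, x2 - 0.5) relative to each other.
    """
    f1 = max(0.0, 1.0 - abs(x1) / 5.0)
    f2 = max(0.0, 1.0 - abs(x2 - 0.5) / 5.0)
    return f1 * f2
```

At the corner (2.5, 3) of the rectangle the factor is 1/4, and it vanishes once x2 reaches 5.5, in agreement with the figures quoted above.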

We expect that it is very unlikely to get curves with $s_1, s_2$ outside of the
above rectangle $R$ by random constructions. In case the algorithm does not find
an answer after, say, 10 times the above expected time, it could pay to start a
classical MCT algorithm to search deterministically in the ends of the wings, if
we have enough memory for this much smaller subproblem. Otherwise, another
method is to start a chain corresponding to these wings in the area outside the
rectangle with a small probability, so that we do not perturb the average runtime
but we can guarantee that the program finishes.



5 Practical Experiments 

We did two kinds of experiments: first, with a high-level implementation using
the Magma computer algebra system [5], we ran some simulations to check the
validity of our approach in various situations. Then, we wrote a parallel C++
implementation of our algorithm using the NTL [19] and the MPICH [1] libraries,
in order to run tests with real-sized curves. In our experiments, we used the
reduced search space given in Section 4.4, with jw?. 

5.1 Simulations 

In order to test the validity of our heuristics on sufficiently many examples of
reasonable size, we used the following simulated algorithm. Let $p$ be an integer.
We pick two integers $s_1$ and $s_2$ at random uniformly in the area of Figure 1. Then
we form the integer $N = p^2 + 1 - s_1(p+1) + s_2$ and work in the group $\mathbb{Z}/N\mathbb{Z}$
instead of $J(C)$. An integer $m$ is chosen, and using only $p$ and $s_1, s_2$ modulo
$m$, our goal is to recover $N$. Thus, we can construct appropriate examples with
different values for the parameters at almost no cost.
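The simulated algorithm can be sketched as follows. This is an illustration under stated assumptions, not the authors' Magma code: the rectangle uses the crude bounds |s1| <= 4 sqrt(p) and -2p <= s2 <= 6p, a single index k (instead of a pair k, k') selects the offset, the second-direction offsets are taken nonzero to avoid fixed points, chains are capped at ten times their expected length, and the hash `mix` and all parameter defaults are hypothetical. The secret N is used only to reduce group elements.

```python
import random
from math import isqrt


def mix(x):
    """Cheap non-cryptographic hash of a group element (here an integer)."""
    return ((x * 0x9E3779B97F4A7C15) & ((1 << 64) - 1)) >> 24


def simulate(p, m, s1, s2, dp_bits=5, r=20, seed=1, max_chains=1_000_000):
    N = p * p + 1 - s1 * (p + 1) + s2             # secret group order
    s1bar, s2bar = s1 % m, s2 % m                 # information given to the solver
    K = p * p + 1 - s1bar * (p + 1) + s2bar
    root = isqrt(16 * p) + 1                      # upper bound on 4*sqrt(p)
    lo1, hi1 = (-root - s1bar) // m, (root - s1bar) // m
    lo2, hi2 = (-2 * p - s2bar) // m, (6 * p - s2bar) // m
    c1, c2 = (lo1 + hi1) // 2, (lo2 + hi2) // 2
    Kc = K + m * (-c1 * (p + 1) + c2)             # centred constant K'
    chain_len = 1 << dp_bits                      # average chain length 1/p_D
    ell1 = max(1, round((hi1 - lo1) / (9 * chain_len ** 0.5)))
    ell2 = max(1, (hi2 - lo2) // (10 * chain_len))
    rng = random.Random(seed)
    steps = [(rng.randint(0, 2 * ell1), rng.randint(1, 2 * ell2)) for _ in range(r)]
    stored = {}                                   # distinguished point -> (set, sigma_1, sigma_2)
    for chain in range(max_chains):
        label = "WT"[chain % 2]                   # alternate between W and T
        x1, x2 = rng.randint(lo1, hi1), rng.randint(lo2, hi2)
        P = ((Kc if label == "W" else 0) + m * (-x1 * (p + 1) + x2)) % N
        for _ in range(10 * chain_len):
            h = mix(P)
            if h % chain_len == 0:                # distinguished point reached
                break
            sign = -1 if (h >> dp_bits) & 1 else 1
            a, b = steps[(h >> (dp_bits + 1)) % r]
            P = (P + m * (sign * a * (p + 1) + b)) % N   # add one precomputed offset
            x1, x2 = x1 - sign * a, x2 + b
        else:
            continue                              # chain too long, drop it
        if P in stored and stored[P][0] != label:
            _, y1, y2 = stored[P]
            d1, d2 = (x1 - y1, x2 - y2) if label == "W" else (y1 - x1, y2 - x2)
            return Kc + m * (-d1 * (p + 1) + d2)  # candidate order from the collision
        stored[P] = (label, x1, x2)
    raise RuntimeError("no collision found")
```

The returned value is always a multiple of N; for parameters of the size used below it is N itself, because the pairs stay far too small for another multiple to occur.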

We chose several pairs $(p, m)$ always yielding $\#R \approx 5 \times 10^{10}$, so that $p_{\mathcal{D}}$ can
be taken to be $2^{-8}$, and with $\ell_1$ varying from 0.01 to 21. For each such pair
$(p, m)$, 100 random pairs $(s_1, s_2)$ were tested and the average number of steps
was measured and compared to $p^{3/4}/m$. The results are reported in Table 1.



Table 1. Simulations with cyclic groups $\mathbb{Z}/N\mathbb{Z}$. Each line corresponds to 100 runs.

p              m               $\ell_1$   $\ell_2$       Avg ratio nb jumps / ($p^{3/4}/m$)
5.7 × 10^7     14              21      8,040        12.7
9.2 × 10^8     118             10      15,250       12.7
1.5 × 10^10    949             5       30,350       12.5
1.5 × 10^14    953674          0.5     304,000      12.6
1.2 × 10^17    144675925       0.1     1,621,271    14.3
1.9 × 10^18    1157407407      0.05    3,242,542    14.5
2.5 × 10^21    250000000000    0.01    19,455,252   22.2


Our conclusion is that even if the rectangle $R$ is very thin (i.e. $\ell_1$ is tiny), then
the measured running time is quite close to the predictions. In the extreme, the
last case $\ell_1 = 0.01$ corresponds to a case where $m \approx 4\sqrt{p}$, therefore $B_{1,\max} - B_{1,\min}$
is just 2. In that case, the heuristics we used hardly make sense, but it is a good
surprise that the average running time is still within a factor 2 of the heuristic
analysis.

5.2 Several Runs on the Same Curve 

We ran our software implementation many times on a given curve, in order to 
check that the average measured running time is close to the heuristic estimate, 
for a medium sized problem. 

Let $p = 5 \times 10^{24} + 8503491$ and $f$ a random monic squarefree polynomial of
degree 5 over $\mathbb{F}_p$. Using the Schoof-like method described in [9] we have deduced
the values of $s_1$ and $s_2$ modulo $m = 44696171520$. The curve is such that $s_1/\sqrt{p} \approx -0.84$
and $s_2/p \approx 0.38$, therefore it is not close to any border of the bounds.

With the MCT algorithm, the computation is feasible and requires about
1.5 GB of memory. We ran our software 100 times on that input, and for each
run we measured the number of operations in the Jacobian. The average value
is $11.27\,p^{3/4}/m$, which is in accordance with our estimates. The minimal value
is $2.69\,p^{3/4}/m$ and the maximal value is $31.55\,p^{3/4}/m$. In Figure 4 we give the
histogram for the number of runs whose running time is in a given range.




Fig. 4. Histogram showing the number of runs whose running time is in the given range
(divided by $p^{3/4}/m$)



5.3 A Larger Example 

In order to test the scalability of our method, we ran a larger example. Let
$p = 5 \times 10^{24} + 8503491$ as above, and $f$ a random monic squarefree polynomial
of degree 5 over $\mathbb{F}_p$. We now suppose that $s_1$ and $s_2$ are known modulo only
$m = 1655413760$ (note that the modulus 44696171520 mentioned above equals
27 × 1655413760).

With this more restricted information, we cannot conclude using the original
MCT algorithm: the computation would require storing about $5 \times 10^{9}$ points
together with their indices. Even using hash tables, it seems difficult to use less
than 8 bytes for each entry, which means at least 40 GB of memory.

We ran our algorithm on that input. The probability of being distinguished
was set to $p_{\mathcal{D}} = 2^{-24}$. The program was run in parallel on a cluster of 24
Pentium IV at 3 GHz. After 4 hours of computation, 544 distinguished points were
computed and a useful collision occurred. About $9 \times 10^{9}$ steps were performed:
this is a "lucky" run, since this is about $4.3\,p^{3/4}/m$. The memory requirement
was about 1.5 MB on each node, mostly for the executable code, not the data.
The amount of communication is reduced to a few KB between the nodes and a
"master" node.

For that curve, $s_1/\sqrt{p} \approx -1.16$ and $s_2/p \approx 1.25$, so the curve was not
exceptional, in the sense that it was not too close to the border of the area of
Figure 1.

6 Curves of Higher Genus

The MCT algorithm can be extended to higher genus curves for which the char- 
acteristic polynomial of the Frobenius endomorphism is known modulo some 
integer m. In [11], different variants of this extension are studied and compared. 
Our extension of the original genus 2 MCT algorithm also applies to all these 
variants with a few modifications. Indeed, the number $k$ of unknown coefficients
in the characteristic polynomial can be larger than 2 and then a $k$-dimensional
random walk should be designed in order to search for a collision in the
intersection of two $k$-dimensional boxes. It is a tedious task to fill in the details, in
particular the expected constants hidden in the $O(\,)$, but for fixed genus, we
certainly lose only a constant factor compared to the memory-costly algorithms.

Another range of application of our method is the BSGS algorithm developed 
in [3] for counting points of the Jacobians of Picard curves. Without giving many 
details, let us mention that their algorithm ends by the search for a collision in 
two arithmetic progressions; therefore our algorithm applies almost directly to 
that case and should dramatically reduce the space complexity, which is the 
main drawback of their method. 



7 Conclusion 

We have presented a low-memory, parallelizable analogue of the algorithm by 
Matsuo, Chao and Tsujii for genus 2 hyperelliptic curves; the method works in 
other cases, such as the BSGS algorithm of [3] for Picard curves. 

The main tool we used is a bidimensional pseudo-random walk. As usual
with this kind of algorithm, it is impossible to make a rigorous analysis, and
heuristics and computer experiments are necessary to validate the approach.
Our numerical data are positive in that sense, therefore our algorithm can be
used when the memory constraint becomes problematic.

Abstract. In this paper we investigate the efficiency of the function field
sieve to compute discrete logarithms in the finite fields $\mathbb{F}_{3^n}$. Motivated
by attacks on identity based encryption systems using supersingular
elliptic curves, we pay special attention to the case where $n$ is composite.
This allows us to represent the function field over different base fields.
Practical experiments appear to show that a function field over $\mathbb{F}_3$ gives
the best results.



1 Introduction 

Research into the discrete logarithm problem (DLP) in finite fields has typi- 
cally only focused on characteristic two and large prime fields, because these 
offer the greatest efficiency for implementations. However, since the proposal of 
the first fully functional identity based encryption scheme (IBE) by Boneh and 
Franklin [4], characteristic three fields have gained a significant cryptographic 
interest. These fields allow optimal security parameters with reduced bandwidth 
in the context of supersingular elliptic curves and pairings [9], used by concrete 
IBE implementations. 

First suggested by Shamir [22] in 1984, the concept of identity based
cryptography has been an attractive target for researchers because it has the potential
to massively reduce the complexity of current PKI systems. By using the notion
of identity as a user's public key, the amount of infrastructure required is greatly
reduced since a message sender implicitly knows the public key of the
recipient. An often used example is that of an email address. Within this context, an
identity for Alice might be the string

alice@hotmail.com.

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 223-234, 2004.
If Bob wants to send Alice secure email, he implicitly knows her email address
and hence her identity and public key. Therefore he can encrypt the email to her 
without the same level of involvement from, for example, certificate and trust 
authorities. 

Since the paper of Boneh and Franklin [4] a large body of work on efficient 
implementation of the underlying algorithms [5,10] and arithmetic in both soft- 
ware [13] and hardware [6,19], has made such applications viable. Almost all 
these implementations have made use of elliptic curves defined over finite fields 
of characteristic three. The security of these IBE implementations is therefore 
dependent on the difficulty of solving the discrete logarithm problem both on an 
elliptic curve over a field of characteristic three and in a finite field of character- 
istic three. 

Algorithms for discrete logarithm problems on elliptic curves do not depend, 
essentially, on the underlying finite field. Hence, their behaviour is well under- 
stood. The same cannot be said for discrete logarithm algorithms over finite 
fields. Such algorithms depend quite closely on the characteristic. Indeed differ- 
ent algorithms are chosen for different size characteristics. 

The best known algorithm for solving the DLP in fields $\mathbb{F}_{p^n}$ of small
characteristic is currently the Function Field Sieve (FFS). First developed by Adleman
[1] and Adleman and Huang [2] in 1994, the FFS may be regarded as an
extension of Coppersmith's earlier discrete logarithm algorithm for characteristic two
fields [7,8]. Initially the algorithm applied to small fields subject to the condition
$p^6 \le n$ as $p^n \to \infty$, where $p^n$ is the size of the finite field. Schirokauer [21] then
extended the method to relax the condition to $p \le n^{o(\sqrt{n})}$. Asymptotically it has
complexity

\[ L_{p^n}[1/3, (32/9)^{1/3}] = \exp\left(\bigl((32/9)^{1/3} + o(1)\bigr)\,\log(p^n)^{1/3}\,\log(\log(p^n))^{2/3}\right). \]
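To get a feeling for how this expression grows, it can be evaluated with the $o(1)$ term dropped. The sketch below is only a rough heuristic: ignoring $o(1)$ is an assumption, so the absolute values are not operation counts, and the function name is illustrative.

```python
from math import exp, log


def ffs_cost(field_bits, c=(32 / 9) ** (1 / 3)):
    """Evaluate L[1/3, c] without the o(1) term; field_bits = log2(p^n)."""
    ln_size = field_bits * log(2)          # natural logarithm of the field size
    return exp(c * ln_size ** (1 / 3) * log(ln_size) ** (2 / 3))
```

Comparing, say, `ffs_cost(1024)` with `ffs_cost(512)` shows the subexponential growth in the bit length of the field.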

More recently, Joux and Lercier [14] provided a slightly more practical algorithm, 
which chooses the initial function field and resulting polynomials more efficiently. 
It is this latter method that we adopt for our implementation making minor 
changes to facilitate the use of superelliptic curves which provide some efficiency 
benefits. 

In the past, discrete logarithm computations in finite fields of small
characteristic have been carried out only in characteristic two, see for example [11,24,14].
In each of these works the extension degree was prime, since the underlying
protocols being attacked were assumed to come from traditional discrete logarithm
based systems in which one almost always selects prime extension degrees.

The fields of characteristic three arising in identity based encryption
systems are all of composite extension degree. The so-called MOV embedding degree is
usually chosen to be six, see [18,9]. We then have the option to apply the FFS
method relative to any subfield. A theoretical analysis indicates that in this case,
some representations may be more amenable to attack than others [12], and we
investigate these possibilities also.

The remainder of this paper is organized as follows: in Section 2 we recall the
mathematical background of the function field sieve and in Section 3 we provide
a detailed description of our implementation. In Section 4 we report on several
practical experiments which investigate the efficiency of the function field sieve
in characteristic three and in Section 5 we present some conclusions.



2 The Function Field Sieve 

The finite fields of interest to pairing based cryptography in characteristic three
are given by $K = \mathbb{F}_{3^n}$ where $n = 6m$. In what follows we set $q = 3^m$. In practice,
we limit ourselves to finite fields which could arise as the MOV embedding of a
supersingular elliptic curve over the field $\mathbb{F}_q$ whose group order is divisible by a
prime $l$ of size comparable to that of $q$. Such elliptic curves have group orders
given by

$$q \pm \sqrt{3q} + 1.$$

We wish to investigate not only the practicality of the function field sieve for
the field extension $\mathbb{F}_{3^n}/\mathbb{F}_3$, but also the effect of taking different base fields,
i.e. looking at the extension $\mathbb{F}_{3^n}/\mathbb{F}_{3^e}$ where $e = 1, 2, 3$ or $6$. To this end we let
$k = \mathbb{F}_{3^e}$, $N = n/e$ and $p = 3^e$.

We assume we are given $\alpha, \beta \in K$, both of order $l$, such that

$$\beta = \alpha^x$$

for some unknown $x \in \{1, \ldots, l - 1\}$. The discrete logarithm problem then is to
compute $x$ given $\alpha$ and $\beta$.

We will use a function field $F = k(X)[Y]/(H)$ defined by the polynomial
$H(X,Y) \in k[X,Y]$ over the rational function field $k(X)$. Note, we shall abuse
notation slightly and refer to the polynomial $H(X,Y)$ as the curve $H$, by which
we mean the curve defined by $H(X,Y) = 0$. For practical reasons [17], one usually
selects a $C_{ab}$-curve for $H$. However, in our examples we used a superelliptic
curve of the form

$$H(X,Y) = Y^d + R(X)$$

where $R(X)$ is a polynomial in $k[X]$ of degree $b$. This enables more efficient
calculation of the functions on the algebraic side, at the expense of a little
less generality of our implementation. The class number of the function field $F$
defined by $H$, or equivalently the number of points on the Jacobian $J_H$ of the
curve defined by $H$ over $k$, we shall denote by $h = \#J_H(k)$.

We assume that the field $K$ is defined by a polynomial basis with respect to
the polynomial $f \in k[X]$ of degree $N$, i.e.

$$K \cong k[X]/(f).$$

To define the function field sieve algorithm we need to specify two polynomials
$u_1, u_2 \in k[X]$ such that the norm of the function $u_1 + u_2 Y$, given by the resultant

$$u_2^d H(X, -u_1/u_2) = (-u_1)^d + u_2^d R,$$

is divisible by $f$. In such a situation we have a surjective homomorphism

$$\phi : k[X,Y]/(H) \to K \cong k[X]/(f), \qquad Y \mapsto -u_1/u_2.$$

We select a rational factorbase $\mathcal{R}$ of small degree irreducible polynomials in
$k[X]$ and an algebraic factorbase $\mathcal{A}$ of small prime divisors in the divisor group
$\mathrm{Div}(F)$. Hence, $\mathcal{R}$ and $\mathcal{A}$ are defined by

$$\mathcal{R} = \{p : \deg p \le B,\ p \text{ irreducible}\},$$

$$\mathcal{A} = \{(p, Y - r) : \deg p \le B,\ p \text{ irreducible},\ r \equiv -u_1/u_2 \pmod{p}\},$$

for some smoothness bound $B$.

The goal of the function field sieve is to find relatively prime pairs of polynomials
$(r, s)$, with $\deg r, \deg s \le l$, such that the polynomial $(s u_2 - r u_1)$ and the
divisor $(s + rY)$ simultaneously factor over the respective factorbases, i.e.

$$s u_2 - r u_1 = \prod_{p_i \in \mathcal{R}} p_i^{a_i},$$

$$(s + rY) = \sum_{(p_j, Y - r_j) \in \mathcal{A}} b_j\,(p_j, Y - r_j).$$

Determining the factorization of $(s + rY)$ over $\mathcal{A}$ is easily done by examining
the factorization of the norm

$$N((s + rY)) = (-s)^d + r^d R.$$

Since $h\,(p_j, Y - r_j)$ is a principal divisor, for each $(p_j, Y - r_j) \in \mathcal{A}$ there exists
a function $\lambda_j \in F^*$ with $h\,(p_j, Y - r_j) = (\lambda_j)$ and such a function is unique up
to multiplication by an element in $k^*$. We then have that our algebraic relation
is given by

$$(s + rY)^h = \mu \prod_j \lambda_j^{b_j}$$

with $\mu$ an element in $k^*$. We then apply the homomorphism $\phi$ above to obtain

$$\left(\frac{s u_2 - r u_1}{u_2}\right)^h \cong \prod_j \phi(\lambda_j)^{b_j}$$

where $\cong$ denotes equality modulo a possible factor in $k^*$. Assuming $h$ is coprime
to $(p^N - 1)/(p - 1)$, we can take $h$-th roots of both sides of this equation and if
we write $\kappa_j = \phi(\lambda_j)^{1/h}$ we obtain

$$s u_2 - r u_1 \cong u_2 \prod_j \kappa_j^{b_j}.$$

Combining the relation on the rational side with the above equation we find

$$\frac{1}{u_2} \prod_{p_i \in \mathcal{R}} p_i^{a_i} \cong \prod_j \kappa_j^{b_j}.$$

Hence, we obtain the relation between discrete logarithms given by

$$\sum_{p_i} a_i \log_g p_i - \log_g(u_2) \equiv \sum_j b_j \log_g \kappa_j \pmod{(p^N - 1)/(p - 1)} \qquad (1)$$


where $g$ is a multiplicative generator of the field $K$. Note that we do not need at
any point to compute the values of the $\kappa_j$, or for that matter the values of the
$\lambda_j$; all that we require is that they exist. This is guaranteed by the condition
that $h$ should be coprime to $(p^N - 1)/(p - 1)$.

If sufficiently many independent relations of the form (1) have been obtained,
we can solve for the discrete logarithms themselves using structured Gaussian
elimination combined with the Lanczos method. Determining the discrete logarithm
of $\beta$ with respect to $\alpha$ is then performed using a standard recursive sieving
strategy as explained in [21].



3 Choice of Parameters and Implementation Details 

The various parameters of the function field sieve, namely the size $d$ of the
function field extension, the size $B$ of the largest factorbase element and the size
$l$ of the polynomials $r$ and $s$, are approximated by a heuristic analysis [14] of the
function field sieve as

$$l = B, \qquad B = \left(\frac{4}{9}\right)^{1/3} N^{1/3} \log_p(N)^{2/3}, \qquad d = \sqrt{\frac{N}{B + 1}}.$$

Since we have restricted ourselves to superelliptic curves we need to ignore values
of $d$ which are divisible by three. However, in the range of our experiments this
is not an issue and extending our results to the case where $d \equiv 0 \pmod 3$ can be
accomplished using $C_{ab}$-curves [17], at the expense of more complicated formulae
on the algebraic side.



3.1 Selection of $f$

Following Joux and Lercier [14] we first select $u_1$ and $u_2$ and then find a suitable
value of $f$. We set $N' = N \pmod d$. However, since the degree $N = n/e$ may
not be prime we split into three possible sub-cases:

– Case (i) : $\gcd(N', d) = 1$.

We choose a curve $H$ such that $R(X)$ is of degree $b = N'$ and such that
$\#J_H(k)$ is coprime to $(p^N - 1)/(p - 1)$. If this is not possible we go to Case
(iii). We then select $u_1$ and $u_2$ of exact degrees $m - 1$ and $m$ respectively
where $m = (N - b)/d$. The degree of the polynomial $f(X) = u_2^d H(X, -u_1/u_2)$
is then given by $\max\{d \cdot (m - 1), d \cdot m + b\} = N$. Hence, we keep selecting
$u_1$ and $u_2$ until $f$ is irreducible.

– Case (ii) : $N' = 0$.

We select $R(X)$ of smallest possible degree $b$ such that $\#J_H(k)$ is coprime
to $(p^N - 1)/(p - 1)$. We then select $u_1$ and $u_2$ of exact degrees $m$ and $m - 1$
where $m = N/d$. The degree of the polynomial $f(X) = u_2^d H(X, -u_1/u_2)$ is
then given by $\max\{d \cdot m, d \cdot (m - 1) + b\} = N$, assuming $\deg R < d$. Hence,
we keep selecting $u_1$ and $u_2$ until $f$ is irreducible.

– Case (iii) : $\gcd(N', d) \ne 1, d$, or no suitable curve found above.

Here we select $R(X)$ of degree $b < d$, with $\gcd(b, d) = 1$, such that $\#J_H(k)$
is coprime to $(p^N - 1)/(p - 1)$ and for which one of $t_1$ and $t_2$ is minimal
where

$$t_1 = \max\{N, d \cdot m, d \cdot (m - 1) + b\},$$
$$t_2 = \max\{N, d \cdot (m - 1), d \cdot m + b\}.$$

In the case of $t_1$ being minimal we then select $u_1$ and $u_2$ of degree $m$ and
$m - 1$ until $u_2^d H(X, -u_1/u_2)$ is divisible by an irreducible polynomial $f$ of
degree $N$. In the case of $t_2$ being minimal we select $u_1$ and $u_2$ to be of degree
$m - 1$ and $m$, until we obtain an irreducible polynomial $f$ of degree $N$ which
divides $u_2^d H(X, -u_1/u_2)$.

Note that Joux and Lercier [14] only considered Case (i) above, since they used
arbitrary $C_{ab}$-curves (and not simply superelliptic ones), and because the extension
degree $N$ was always prime. Clearly, Case (iii) will lead to marginally
less efficient relation collection, since the degree of the algebraic side will be
slightly higher than if one was in Case (i) or (ii). However, if one is to deal with
non-prime values of $N$ and superelliptic curves one is led to such a case.
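
The case distinction above depends only on $N$ and $d$, so it can be sketched in a few lines. The following Python fragment is an illustration under stated assumptions, not the code used for the experiments: the function names are hypothetical, and the class number test on $\#J_H(k)$ (which can force a fall-back from Case (i) to Case (iii)) is not modelled.

```python
from math import gcd

def select_case(N, d):
    """Return (case, b) for extension degree N and function field degree d.

    b is the degree of R(X) when the case fixes it, otherwise None.
    The coprimality test on #J_H(k) is NOT modelled here.
    """
    n_prime = N % d
    if n_prime == 0:
        return ("ii", None)      # b: smallest degree passing the class number test
    if gcd(n_prime, d) == 1:
        return ("i", n_prime)    # b = N'
    return ("iii", None)         # b < d with gcd(b, d) = 1, chosen via t_1 or t_2

def degree_of_f(d, m, b, u1_larger):
    """Degree of (-u1)^d + u2^d * R when deg R = b.

    u1_larger=False: deg u1 = m - 1, deg u2 = m      (Case (i))
    u1_larger=True : deg u1 = m,     deg u2 = m - 1  (Case (ii))
    """
    if u1_larger:
        return max(d * m, d * (m - 1) + b)
    return max(d * (m - 1), d * m + b)
```

For instance, `select_case(186, 5)` falls into Case (i) with $b = 1$, while `select_case(62, 4)` falls into Case (iii) because $\gcd(2, 4) = 2$.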

3.2 Lattice Sieving 

The finding of $(r, s)$ is performed using a lattice sieve. In the lattice sieve one
selects a prime polynomial $p \in \mathcal{R}$ (resp. prime divisor $(p, Y - r) \in \mathcal{A}$). Then one
looks at the sub-lattice of the $(r, s)$ plane such that $s u_2 - r u_1$ (resp. $(s + rY)$)
is divisible by the chosen polynomial (resp. divisor). We found it more efficient
to then sieve in the sub-lattice on a line-by-line basis, rather than sieve in a 
two-dimensional manner in the sub-lattice. 

The use of lattice sieving has a number of advantages over sieving in the (r, s) 
plane. Firstly, it is better at yielding a large number of relations. Secondly, one 
can target factorbase elements for which one does not yet have a relation. This 
enables one to obtain a matrix involving all elements in the factorbase reasonably
efficiently. Thirdly, the use of lattice sieving is crucial in the final stage where
one wishes to target individual discrete logarithms using the recursive sieving 
strategy mentioned earlier. 



3.3 Factorbase Size 

A major consideration is the balancing of the effort required for the sieving 
and the matrix step. In theory one balances the two halves of the computation 
and so derives the complexity estimates such as those above. The size of both
factorbases is approximated by $p^B/B$, however one should treat this size as a
continuous function of a real variable $B$, as opposed to the discrete $B$ above. See
[12] for further analysis of this point.

However, sieving can be performed in a highly scalable manner. After all,
to move from using a single computer performing the sieving to around 100-200
computers is relatively easy given the resources of most organizations these days.
By contrast, our code for the matrix step required the use of a single machine and
hence does not scale.

Hence, in practice we can devote less total time to the matrix step compared
to the sieving step. Since the matrix step has complexity approximately $O(T^2)$
where $T = \#\mathcal{R} + \#\mathcal{A} \approx 2p^B/B$, we see that we have a physical constraint on the
size of the factorbases we can accommodate. In our experiments we assumed that
solving a matrix with over half a million rows and columns was infeasible given
our resources and matrix code. This sometimes led us to choose non-optimal,
from a theoretical perspective, values for the other parameters.
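
To make the heuristic estimates concrete, the short Python sketch below evaluates the formulas for $B$ and $d$ from the start of this section together with the factorbase estimate $T \approx 2p^B/B$. The function name is hypothetical, and rounding $B$ and $d$ to the nearest integer is an assumption; it is used here because it appears to reproduce the "rough analysis" values tabulated in Section 4.

```python
import math

def ffs_parameters(n, e):
    """Heuristic FFS parameters for the field with 3^n elements over base field F_{3^e}.

    Rounding to the nearest integer is an assumption, not taken from the text.
    """
    if n % e != 0:
        raise ValueError("e must divide n")
    p = 3 ** e                      # size of the base field k
    N = n // e                      # extension degree of K over k
    log_p_N = math.log(N) / math.log(p)
    B = round((4 / 9) ** (1 / 3) * N ** (1 / 3) * log_p_N ** (2 / 3))
    d = round(math.sqrt(N / (B + 1)))
    T = 2 * p ** B // B             # T ~ 2 p^B / B, both factorbases together
    return {"N": N, "B": B, "d": d, "T": T}

for e in (1, 2, 3, 6):
    print(e, ffs_parameters(186, e))
```

The loop prints one line per base field for $n = 186$, which can be compared against the rough estimates reported for that field size.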



3.4 Linear Algebra Step 

Several studies involving use of index calculus-type methods discuss the linear
algebra step, since it is a major practical bottleneck in the procedure. This is because
parallelising existing algorithms is rather difficult. Various authors [15,20]
have identified several effective techniques for the solution of sparse linear systems
modulo a prime. These include iterative schemes such as the Lanczos, Conjugate
Gradient and Wiedemann algorithms, with or without a pre-processing
step involving structured Gaussian elimination. Recently attention has moved
onto attempting to perform this step in parallel, see for example [25] for the case
of Lanczos over the field $\mathbb{F}_2$.

In our implementation we followed [15] and used a basic structured Gaussian
elimination routine, such as that described in [3,15,20], so as to reduce the size
of the linear system whilst maintaining a degree of sparsity. This submatrix is
subsequently solved by the Lanczos algorithm [16], and a full result set is then
recovered via back-substitution. Our implementation for this stage made use of
Victor Shoup’s NTL C++ library for multiprecision integers [23].



4 Experiments

As mentioned in the introduction, we are not only interested in how efficient the
function field sieve method is on fields of relevance to IBE based systems. We
are also interested in the effect of choosing a different base field $k = \mathbb{F}_{3^e}$ in the
function field sieve. Usually for traditional discrete logarithm systems this would
not be an issue since the extension degree is often prime. One should note that
a precise theoretical analysis [12] of the function field sieve method shows that
for a fixed field size the effect of the size of the base field is not as one would
expect. This strange effect of the base field size $3^e$ on the overall performance
for a fixed field size $3^n$ can be explained due to the non-continuous behaviour of
various parameters, in particular the function field extension degree $d$.

Note, the expected run time would be $T^2$ operations in $k[X]$. Each field operation
requires at most $O(e^2)$ operations in $\mathbb{F}_3$ and the polynomials involved are
of degree around

$$\delta = \max\{N/d, dB\}.$$

Hence, one would expect the run time to be proportional to

$$(\delta e T)^2.$$

In the experiments below we selected field sizes $3^n$ which arise from supersingular
elliptic curves with group orders which are “almost” prime.

4.1 Field Size $3^{186}$

This corresponds to a field size of approximately 295 bits. The rough theoretical
estimates, given above, for the various values of $e = 1, 2, 3, 6$ are in the table
below. A more careful analysis as in [12] reveals the following estimates, where
for the factorbases we take the first $T/2$ primes (resp. $T/2$ prime places) on the
rational (resp. algebraic side), in other words the value of $B$ is not used directly.

         Rough Analysis            Analysis of [12]
  e      d     B     T ≈           d     T ≈

  1      4    12      89000        5      85000
  2      4     6     180000        4      75000
  3      4     4     270000        4     150000
  6      3     2     530000        4     330000



We compared these results to the yields provided by our implementation and
found that the best possible values seemed to be given by

  e      d     T

  1      5      70000
  2      4      90000
  3      4     190000








This latter table was produced by comparing the yields of the implementation 
over a fixed time period for various parameter sizes and then selecting the one 
which would produce a full matrix in the shortest time period. We were however 
unable to generate suitable experimental data for the case e = 6 since this field 
is really too large to apply the FFS method in practice for such a value of n. 

4.2 Field Size $3^{222}$

This corresponds to a field size of approximately 352 bits. The rough theoretical
estimates, given above, for the various values of $e = 1, 2, 3, 6$ along with the more
precise estimates of [12], are in the following table.

         Rough Analysis            Analysis of [12]
  e      d     B     T ≈           d     T ≈

  1      4    13     250000        5     130000
  2      4     6     180000        4     190000
  3      4     4     270000        4     270000
  6      4     2     530000        4     460000



We compared these results to the yields provided by our implementation and
found that the best possible values seemed to be given by

  e      d     T

  1      5     100000
  2      4     190000
  3      4     320000



Again the values for e = 6 are not given since n is still too small for this base 
field size to apply the FFS method successfully. 

4.3 Field Size $3^{582}$

We present this field since it is the first one which is usable in pairing based
systems and which “could” be secure against current computing power on both
the elliptic curve and the finite field sides. It corresponds to a bit length of 923
bits. The rough theoretical estimates are given below which should be compared
to the analysis in [12].

         Rough Analysis                      Analysis of [12]
  e      d     B     T ≈                     d     T ≈

  1      5    21     $9.0 \cdot 10^8$        6     $3.0 \cdot 10^7$
  2      5    10     $7.0 \cdot 10^8$        7     $3.0 \cdot 10^8$
  3      5     6     $1.0 \cdot 10^8$        7     $2.8 \cdot 10^8$
  6      5     3     $2.0 \cdot 10^8$        5     $4.0 \cdot 10^8$



Note that these parameters would imply that such key sizes are currently out of
range of the FFS method.

To completely confirm both our theoretical estimates and our partial experiments
we ran a few experiments through to the final matrix stage and computation
of individual logarithms.

Our experiments were run on a network of around 150 Unix based work- 
stations. The network contained a number of older Sparc 5s and 10s running 
Solaris, plus a large number of Linux based machines with AMD1600 or Pen- 
tium 4 processors. We only used idle cycles of the machines whilst other people 
were using them, hence during the day we had the equivalent of 50 Pentium 
4 machines working flat out. At night this increased to around 100 Pentium 4 
machines. The matrix step was run on a Sun Blade 1000 workstation. 

In the following table we present the wall clock time $t_1$ needed to produce the
relations, the time $t_2$ needed to solve the matrix step (divided into the time $t_2'$
needed to perform the structured Gaussian elimination and the time $t_2''$ to solve
the reduced system). We ran the sieving clients for time $t_1$ producing a total
of $R$ relations on $T_a$ elements of the factorbase. For many examples we did not
try to find relations on all the factorbase elements, since finding the relations on
the last few elements can take a disproportionate amount of time. The line $m$
denotes the approximate row-size of the resulting (approximately square) matrix
after the application of structured Gaussian elimination. We believe our times for
the linear algebra step can be considerably improved, and work is ongoing in
this area.



  n            186        186        186        222                222
  N            186         93         62        222                111
  e              1          2          3          1                  2
  d              5          4          4          5                  4
  R(X)           X          X          X        X^2 + 2X + 1       X^3 + X
  T          70000      80000     140000     100000             160000
  R          80045      96365     139376      96956             181675
  T_a        69345      76425     137750      95169             158952
  m          12634      13218      24746      20719              32148
  t_1           8h         7h        30h        48h                50h
  t_2'         43m         1h     2h 24m         3h             4h 20m
  t_2''        13h        16h     1d 18h     2d 21h             4d 14h



5 Conclusion 

We have reported on the first implementation of the function field sieve in char- 
acteristic three. We have paid particular attention to the case of finite fields 
which arise in pairing based cryptosystems. In particular such fields are of a 
composite nature and we have seen that this provides at best a marginal benefit 
in allowing one to apply the function field sieve over either $\mathbb{F}_3$ or $\mathbb{F}_{3^2}$. We have
shown how the exact analysis of [12] is more able to predict the behaviour, and
thereby parameter choices, than the naive simple analysis. We have also shown 
how the key sizes one would use in a simple pairing based system are likely to 
be secure against current algorithms and computing power.

Finally we hope that our work will encourage others to investigate discrete
logarithm algorithms in composite fields of characteristic three, and thereby
allow the community to have greater faith in the security of pairing based systems
which are based over such fields.



Abstract. We give a comparison of the performance of the recently
proposed torus-based public key cryptosystem CEILIDH, and XTR. Underpinning
both systems is the mathematics of the two dimensional algebraic
torus $T_6(\mathbb{F}_p)$. However, while they both attain the same discrete
logarithm security and each achieve a compression factor of three for all 
data transmissions, the arithmetic performed in each is fundamentally 
different. In its inception, the designers of CEILIDH were reluctant to 
claim it offers any particular advantages over XTR other than its exact 
compression and decompression technique. From both an algorithmic 
and arithmetic perspective, we develop an efficient version of CEILIDH 
and show that while it seems bound to be inherently slower than XTR, 
the difference in performance is much smaller than what one might infer 
from the original description. Also, thanks to CEILIDH’s simple group 
law, it provides a greater flexibility for applications, and may thus be 
considered a worthwhile alternative to XTR. 



1 Introduction 

From a representation perspective, basing cryptography in the multiplicative 
group of a finite field is inefficient. The reason for this is that current index 
calculus algorithms can solve discrete logarithms in finite fields approximately 
six times the size of groups to which generic algorithms apply [1,6,8,20,26]. As 
a consequence this representation is about six times less efficient in terms of 
memory and bandwidth, than the information-theoretic limit. In the future this 
ratio will continue to increase as recommended key sizes become larger, since 
algorithms to solve finite field discrete logarithms are subexponential. 

So what can one do if bandwidth, power consumption and speed are essential 
factors in the design of a cryptosystem? The most obvious answer is to use a 
system for which no known subexponential algorithm exists, such as a secure 
elliptic curve [2] or perhaps a genus two hyperelliptic curve [27]. Assuming the 
discrete logarithm problem (DLP) in these groups is as hard as is currently 
believed, these systems provide an effective solution to the issues highlighted 
above, and have been thoroughly researched since introduced in 1985 [10,15], 
and 1989 [11] respectively. 

In recent years systems exploiting the algebraic structure of finite field extensions
have been studied. Whereas in elliptic curve cryptography so-called
‘composite’ finite fields can introduce potential weaknesses [5], for torus-based
cryptography they present the possibility for highly compressed data transmission,
compact implementation, and significantly improved performance.

Torus-based cryptography may be regarded as a natural extension of classical
Diffie-Hellman and El-Gamal in a finite field $\mathbb{F}_p$, where key agreement, encryption
and signature schemes are performed in the multiplicative group $\mathbb{F}_p^*$. For any
positive integer $n$, one can define an algebraic torus $T_n$ over $\mathbb{F}_p$ such that over
$\mathbb{F}_{p^n}$, this variety is isomorphic to $\varphi(n)$ copies of $\mathbb{F}_{p^n}^*$, where $\varphi(n)$ is the dimension
of $T_n$. In fact $T_n$ is nothing other than the cyclotomic subgroup of $\mathbb{F}_{p^n}^*$ [19]. When
$T_n$ is ‘rational’, it is possible to embed $T_n$ in $\varphi(n)$-dimensional affine space, and
thus represent every element by just $\varphi(n)$ elements of $\mathbb{F}_p$.

Only recently was the connection between algebraic tori and the existing 
trace-based systems LUC [25] and XTR [13] made explicit [19]. Of particular 
interest is a current conjecture about algebraic tori that, if true, implies the existence
of practical cryptosystems based in $\mathbb{F}_{p^n}^*$ for suitable $n$ with higher data
compression ratios than both CEILIDH and XTR. An analogous statement has
been proven for Abelian varieties over finite fields [18], though whether these 
are at all practical is debatable. It has also been demonstrated that attempts to 
extend the trace-methods of LUC and XTR to more general symmetric function- 
based compression maps can not work [19]. Thus if one requires smaller repre- 
sentations of elements than those afforded by XTR, torus-based cryptography 
probably holds the greatest potential for doing so. 

In this paper we content ourselves with a comparison of the torus-based 
public key system CEILIDH, introduced at CRYPTO 2003 by Rubin and Sil- 
verberg [19], and its closest relative, XTR, which was introduced at CRYPTO 
2000 by Lenstra and Verheul [13]. Based on current understanding, both systems
attain the security of $\mathbb{F}_{p^6}^*$ while transmitting only two elements of $\mathbb{F}_p$ in
communications. The compression factor of three is possible since the underlying
sets of each possess a birational embedding into $\mathbb{A}^2(\mathbb{F}_p)$, the 2-dimensional
affine space. For implementation the basic difference between the two is that
while XTR essentially performs all arithmetic in $\mathbb{F}_{p^2}$ using a third order recurrence,
the group operation in CEILIDH is performed in $\mathbb{F}_{p^6}^*$, with compression
and decompression effected by the birational map to $\mathbb{A}^2$. In this sense CEILIDH
may be viewed purely as a compression mechanism attached to the usual finite
field arithmetic. Using a rather simplistic analysis, it would seem that since the 
elements of CEILIDH needed for general group operations are three times the 
size of those in XTR, its performance is bound to be inferior. 

Our main contribution is a new representation of the cyclotomic subgroup of
$\mathbb{F}_{p^6}^*$ which permits near-optimal multiplication efficiency. Together with another
squaring-efficient representation of the same subgroup [22], and some arithmetic
refinements of the birational maps constituting the CEILIDH compression mechanism,
we considerably develop the basic method of CEILIDH. In contrast to
what one might naively expect from the original exposition, our implementation
performs nearly as efficiently as the fastest known version of XTR. We hope the
authors of [19] do not object if for simplicity we refer throughout the paper to
our improvements to CEILIDH by the same name as well.

The authors are grateful to F. Vercauteren for many fruitful discussions and 
for his insightful comments, and would like to thank an anonymous referee for 
their helpful suggestions. 



2 Overview 

In this section we give a brief description of algebraic tori and the mathematics 
underlying CEILIDH and XTR. Further details can be found in [19], and a good 
reference for algebraic tori is [28]. In the following, we take as our base field Fp, 
which we use in our implementation, though any finite field would suffice. 



2.1 The Torus $T_n(\mathbb{F}_p)$

Throughout, let $\mathbb{F}_p$ be the prime field consisting of $p$ elements. Let $\varphi$ be the
Euler $\varphi$-function, and let $\Phi_n$ be the $n$-th cyclotomic polynomial. We write $G_{p,n}$
for the subgroup of $\mathbb{F}_{p^n}^*$ of order $\Phi_n(p)$, and let $\mathbb{A}^n(\mathbb{F}_p)$ denote $n$-dimensional
affine space over $\mathbb{F}_p$, i.e., the variety whose points lie in $\mathbb{F}_p^n$.

For cryptographic purposes, when using finite extensions of prime fields, the
cyclotomic subgroup plays a prominent role. Given current index calculus algorithms,
to attain the full security of $\mathbb{F}_{p^n}^*$ one should make sure that the chosen
subgroup $G$ cannot be embedded in any proper subfield of $\mathbb{F}_{p^n}$. One should also
ensure that no information can be obtained regarding the discrete logarithm of
an element by a simple application of the norm function, i.e., the norm of a
generator should be one with respect to every intermediate subfield. These two
conditions are in fact equivalent, and define the cyclotomic subgroup $G_{p,n}$ [12,
19]. Fortunately for cryptography, the second condition coincides with the definition
of the algebraic torus $T_n(\mathbb{F}_p)$, which we now introduce. We do not give
the complete algebraic definition, but for our purposes this is sufficient.

Definition 1. Let $k = \mathbb{F}_p$ and $L = \mathbb{F}_{p^n}$. Define the torus $T_n$ to be the intersection
of the kernels of the norm maps $N_{L/F}$, for all subfields $k \subseteq F \subsetneq L$:

$$T_n(k) := \bigcap_{k \subseteq F \subsetneq L} \mathrm{Ker}[N_{L/F}]. \qquad (1)$$

The dimension of $T_n$ is $\varphi(n)$. Since $T_n(\mathbb{F}_p)$ is a subgroup of $\mathbb{F}_{p^n}^*$, the group
operation is just ordinary multiplication in the larger field. The following lemma
provides some essential properties of $T_n$ [19].

Lemma 1. 1. $T_n(\mathbb{F}_p) \cong G_{p,n}$.

2. $\#T_n(\mathbb{F}_p) = \Phi_n(p)$.

3. If $h \in T_n(\mathbb{F}_p)$ is an element of prime order not dividing $n$, then $h$ does not
lie in a proper subfield of $\mathbb{F}_{p^n}/\mathbb{F}_p$.

When $T_n$ is rational, elements can be represented by just $\varphi(n)$ elements of $\mathbb{F}_p$, and
hence a compression factor of $n/\varphi(n)$ is achieved over the usual representation.
$T_n$ is known to be rational when $n$ is either a prime power, or is a product of
two prime powers, and it is conjectured to be rational for all $n$. For current key
size recommendations, this would have interesting cryptographic applications
for $n = 30$ and $n = 210$, which would give compression ratios of $3\tfrac{3}{4}$ and $4\tfrac{3}{8}$
respectively, assuming good parameters can be found efficiently. This means for
instance that one could achieve 1024-bit security while transmitting 274 or 235
bits respectively in communications. Both CEILIDH and XTR currently require
342 bits in transmissions for the same security, which provides a significant
improvement over ordinary RSA for example in terms of bandwidth.
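
The compression figures quoted above follow directly from $n/\varphi(n)$. The Python sketch below recomputes them; the function names are illustrative, and rounding the transmitted size up to a whole number of bits is an assumption that matches the numbers in the text.

```python
from math import gcd

def euler_phi(n):
    """Euler's totient, by direct counting (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def compression_factor(n):
    """Compression factor n / phi(n) of a rational torus T_n."""
    return n / euler_phi(n)

def transmitted_bits(security_bits, n):
    """Bits sent per element for a field of the given size, rounded up."""
    return (security_bits * euler_phi(n) + n - 1) // n

for n in (6, 30, 210):
    print(n, compression_factor(n), transmitted_bits(1024, n))
```

For $n = 6$ this gives a factor of 3 and 342 bits, which is the CEILIDH and XTR case; $n = 30$ and $n = 210$ give 274 and 235 bits respectively.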

2.2 CEILIDH 

Much of this section is a slightly abridged version of the original description of
CEILIDH [19]. We point out that the construction given there for the rationality
of $T_6$ also provides an explicit rational parametrisation for the torus $T_2$ en
passant. We take advantage of this in our algorithms in Section 3.3.

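Fix $x \in \mathbb{F}_{p^2} \setminus \mathbb{F}_p$, so $\mathbb{F}_{p^2} = \mathbb{F}_p(x)$, and let $\{\alpha_1, \alpha_2, \alpha_3\}$ be a basis for $\mathbb{F}_{p^3}$ over $\mathbb{F}_p$.
Then $\{\alpha_1, \alpha_2, \alpha_3, x\alpha_1, x\alpha_2, x\alpha_3\}$ is a basis for $\mathbb{F}_{p^6}$ over $\mathbb{F}_p$. Let $\sigma \in \mathrm{Gal}(\mathbb{F}_{p^6}/\mathbb{F}_p)$
be the element of order two. Define $\psi_0 : \mathbb{A}^3(\mathbb{F}_p) \to \mathbb{F}_{p^6}$ by

$$\psi_0(u_1, u_2, u_3) = \frac{\gamma + x}{\gamma + \sigma(x)},$$

where $\gamma = u_1\alpha_1 + u_2\alpha_2 + u_3\alpha_3$. Then $N_{\mathbb{F}_{p^6}/\mathbb{F}_{p^3}}(\psi_0(u)) = 1$ for every
$u = (u_1, u_2, u_3)$. Let $U = \{u \in \mathbb{A}^3 : N_{\mathbb{F}_{p^6}/\mathbb{F}_{p^2}}(\psi_0(u)) = 1\}$. By (1), $\psi_0(u) \in T_6(\mathbb{F}_p)$
if and only if $u \in U$, so restricting to $U$ gives a morphism $\psi_0 : U \to T_6$. It
follows from Hilbert’s Theorem 90 that every element of $T_6(\mathbb{F}_p) \setminus \{1\}$ is in the
image of $\psi_0$, and so $\psi_0$ defines an isomorphism

$$\psi_0 : U \xrightarrow{\sim} T_6 \setminus \{1\}.$$

The equation defining $U$ is a quadratic hypersurface in $u_1, u_2, u_3$. Fix a point
$a = (a_1, a_2, a_3) \in U(\mathbb{F}_p)$. By adjusting the basis $\{\alpha_1, \alpha_2, \alpha_3\}$ of $\mathbb{F}_{p^3}$ if necessary,
one can assume without loss of generality that the tangent plane at $a$ to the
surface $U$ is just the plane $u_1 = a_1$. If $(v_1, v_2) \in \mathbb{F}_p \times \mathbb{F}_p$, then the intersection of
$U$ with the line $a + t(1, v_1, v_2)$ consists of two points, namely $a$ and a point of the
form $a + \frac{1}{f(v_1, v_2)}(1, v_1, v_2)$, where $f(v_1, v_2) \in \mathbb{F}_p[v_1, v_2]$ is an explicit polynomial
independent of $p$. The map that takes $(v_1, v_2)$ to the latter point is a birational
isomorphism

$$g : \mathbb{A}^2 \setminus V(f) \xrightarrow{\sim} U \setminus \{a\},$$

where $V(f)$ denotes the subvariety of $\mathbb{A}^2$ defined by $f(v_1, v_2) = 0$. Thus $\psi_0 \circ g$
defines an isomorphism

$$\psi : \mathbb{A}^2 \setminus V(f) \xrightarrow{\sim} T_6 \setminus \{1, \psi_0(a)\}.$$

For the inverse isomorphism, suppose that $\beta = \beta_1 + \beta_2 x \in T_6(\mathbb{F}_p) \setminus \{1, \psi_0(a)\}$
with $\beta_1, \beta_2 \in \mathbb{F}_{p^3}$. One can check that $\beta_2 \ne 0$, and if $\gamma = (1 + \beta_1)/\beta_2$, then
$(\gamma + x)/(\gamma + \sigma(x)) = \beta$. Write $(1 + \beta_1)/\beta_2 = u_1\alpha_1 + u_2\alpha_2 + u_3\alpha_3$ with $u_i \in \mathbb{F}_p$, and define

$$\rho(\beta) = \left(\frac{u_2 - a_2}{u_1 - a_1}, \frac{u_3 - a_3}{u_1 - a_1}\right).$$

Then $\rho : T_6(\mathbb{F}_p) \setminus \{1, \psi_0(a)\} \to \mathbb{A}^2(\mathbb{F}_p) \setminus V(f)$ is the inverse isomorphism of $\psi$,
and hence we have an efficient compression and decompression mechanism for
all (bar two) elements of $T_6(\mathbb{F}_p)$.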
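
The identity behind the inverse map can be checked in a few lines. The derivation below is a sketch using only the definitions of this section: $\sigma$ fixes $\mathbb{F}_{p^3}$ pointwise, and $\beta$ has norm one to $\mathbb{F}_{p^3}$. It also uses that $\beta \ne -1$, which holds because $\#T_6(\mathbb{F}_p) = p^2 - p + 1$ is odd.

```latex
\begin{align*}
\gamma + x &= \frac{1 + \beta_1 + \beta_2 x}{\beta_2} = \frac{1 + \beta}{\beta_2}, \\
\gamma + \sigma(x) &= \frac{1 + \beta_1 + \beta_2\,\sigma(x)}{\beta_2}
  = \frac{1 + \sigma(\beta)}{\beta_2}
  && \text{since } \sigma(\beta_i) = \beta_i, \\
\frac{\gamma + x}{\gamma + \sigma(x)} &= \frac{1 + \beta}{1 + \sigma(\beta)}
  = \frac{1 + \beta}{1 + \beta^{-1}} = \beta
  && \text{since } \beta\,\sigma(\beta) = N_{\mathbb{F}_{p^6}/\mathbb{F}_{p^3}}(\beta) = 1.
\end{align*}
```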



2.3 XTR

In common with CEILIDH, XTR is based on the cyclotomic subgroup $G_{p,6}$. Let
$g$ be a generator for this subgroup. In XTR elements of $\langle g \rangle$ are represented by
their trace over $\mathbb{F}_{p^2}$

$$\mathrm{Tr}_{\mathbb{F}_{p^6}/\mathbb{F}_{p^2}}(g) = g + g^{p^2} + g^{p^4} \in \mathbb{F}_{p^2},$$

and hence need only two elements of $\mathbb{F}_p$ to specify. The set of traces constitute
the XTR ‘group’. Clearly,

$$\mathrm{Tr}_{\mathbb{F}_{p^6}/\mathbb{F}_{p^2}}(g) = \mathrm{Tr}_{\mathbb{F}_{p^6}/\mathbb{F}_{p^2}}(g^{p^2}) = \mathrm{Tr}_{\mathbb{F}_{p^6}/\mathbb{F}_{p^2}}(g^{p^4}),$$

and so given an element in the XTR group one cannot distinguish between
$g$ and its conjugates $g^{p^2}$ and $g^{p^4}$. Hence decompression to $\mathbb{F}_{p^6}$ is not unique,
though this can easily be resolved. The analogue of the DLP is to compute $n$
given $\mathrm{Tr}_{\mathbb{F}_{p^6}/\mathbb{F}_{p^2}}(g)$ and $\mathrm{Tr}_{\mathbb{F}_{p^6}/\mathbb{F}_{p^2}}(g^n)$. One can convert this to an ordinary DLP
by mapping both back to $\mathbb{F}_{p^6}^*$ by finding the correct root of

$$X^3 - \mathrm{Tr}_{\mathbb{F}_{p^6}/\mathbb{F}_{p^2}}(g) X^2 + \mathrm{Tr}_{\mathbb{F}_{p^6}/\mathbb{F}_{p^2}}(g)^p X - 1 = (X - g)(X - g^{p^2})(X - g^{p^4}),$$

and similarly for $\mathrm{Tr}_{\mathbb{F}_{p^6}/\mathbb{F}_{p^2}}(g^n)$. The real benefit of the XTR representation is
the ease with which arithmetic can be performed. Let $c_n = \mathrm{Tr}_{\mathbb{F}_{p^6}/\mathbb{F}_{p^2}}(g^n)$. To
compute $c_n$ given $c_1$, one uses some properties of third order addition chains
over $\mathbb{F}_{p^2}$ applied to the recurrence

$$c_{u+v} = c_u c_v - c_v^p c_{u-v} + c_{u-2v}.$$

As a consequence, exponentiations can be performed faster than with an optimal
representation of $\mathbb{F}_{p^6}$ [23,22]. The main drawback of XTR is that one cannot
perform straightforward multiplication since the set of traces is not a group.
By keeping track of the correct conjugate however, this can be accomplished.
Rather than being based on the torus $T_6$, XTR may be viewed algebraically
as a quotient of $T_6$ by an action of the symmetric group $S_3$ [19]. What makes
XTR possible is that the trace map from this quotient variety to $\mathbb{F}_{p^2}$ provides
an explicit rational parametrisation. For various algorithmic improvements and
implementation tricks for XTR we refer the reader to [23].
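
The trace recurrence $c_{u+v} = c_u c_v - c_v^p c_{u-v} + c_{u-2v}$ can be checked numerically on a toy example. The Python sketch below is an illustration only, not an XTR implementation: it assumes the tiny prime $p = 5$ and models $\mathbb{F}_{p^6}$ as $\mathbb{F}_5[z]/(z^6 + z^3 + 1)$, which is irreducible because 5 has order 6 modulo 9. All names are hypothetical, and traces are kept as elements of $\mathbb{F}_{p^6}$ rather than compressed to two elements of $\mathbb{F}_p$.

```python
from itertools import product

P = 5                      # toy prime (assumption); z^6 + z^3 + 1 is irreducible over F_5
ORDER = P * P - P + 1      # Phi_6(P) = 21, the order of the cyclotomic subgroup
ONE = [1, 0, 0, 0, 0, 0]   # elements are coefficient lists in 1, z, ..., z^5

def add(a, b):
    return [(x + y) % P for x, y in zip(a, b)]

def sub(a, b):
    return [(x - y) % P for x, y in zip(a, b)]

def mul(a, b):
    prod = [0] * 11
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % P
    for k in range(10, 5, -1):             # reduce using z^6 = -z^3 - 1
        coeff, prod[k] = prod[k], 0
        prod[k - 3] = (prod[k - 3] - coeff) % P
        prod[k - 6] = (prod[k - 6] - coeff) % P
    return prod[:6]

def power(a, n):
    result = ONE
    while n:                               # square-and-multiply, n >= 0
        if n & 1:
            result = mul(result, a)
        a = mul(a, a)
        n >>= 1
    return result

def find_generator():
    """Search for an element of order exactly 21 = 3 * 7."""
    cofactor = (P ** 6 - 1) // ORDER
    for coeffs in product(range(P), repeat=6):
        a = list(coeffs)
        if not any(a):
            continue
        g = power(a, cofactor)
        if power(g, 3) != ONE and power(g, 7) != ONE:
            return g
    raise RuntimeError("no generator found")

G = find_generator()

def trace(h):
    """Trace from F_{p^6} to F_{p^2}: h + h^(p^2) + h^(p^4)."""
    return add(add(h, power(h, P ** 2)), power(h, P ** 4))

def c(n):
    return trace(power(G, n % ORDER))

def recurrence_holds(u, v):
    rhs = add(sub(mul(c(u), c(v)), mul(power(c(v), P), c(u - v))), c(u - 2 * v))
    return c(u + v) == rhs

print(all(recurrence_holds(u, v) for u in range(ORDER) for v in range(ORDER)))
```

Indices are reduced modulo the group order, so negative indices such as $u - 2v$ are handled by the same function; the term $c_v^p$ is computed with the Frobenius map and equals $c_{-v}$ in this subgroup.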
3 Efficient Representations and Algorithms for $T_6(\mathbb{F}_p)$

In this section we develop suitable field representations and algorithms for the 
efficient implementation of the CEILIDH cryptosystem. We base our implemen- 
tation on Example 11 of [19], since this allows key generation to be done exactly 
as in XTR, and so a fair comparison can be made. In addition to our new 
multiplication-efficient representation, we make some basic observations that re- 
sult in a considerable improvement over the original algorithms. 

Depending on the particular protocol one wishes to implement, and even on 
which part of the protocol, some representations of the underlying field F^e may 
perform better than others. For example, in the first stage of a Diffie-Hellman 
key agreement, both Alice and Bob exponentiate a fixed public base g, so here 
one can precompute some powers of g, and one should use a field representation 
that permits fast multiplication. In the second stage, both parties exponentiate 
a random element of F*e , and so here one should use a representation which per- 
mits fast squaring. As we show, optimising each consideration leads to different 
field representations, whilst switching between them is a simple matter. 

For the two operations of exponentiating a fixed and a random base, two
field representations suffice. The first is a degree six extension of
$\mathbb{F}_p$ and allows us to use many implementation tricks [22], which
include very fast squaring in the cyclotomic subgroup $G_{p,6}$. We refer to this
representation as $F_1$.

The second representation $F_2$ fulfills two functions: built as a quadratic
extension of a cubic extension of $\mathbb{F}_p$, it firstly permits the
efficient use of the birational maps $\psi$ and $\rho$ which are essential to
CEILIDH, given our cheap inversion method; but primarily, it provides the basis
for arithmetic in $F_3$. What we refer to as $F_3$ is a semi-compressed
fractional form of the torus $T_6$, which is actually just the torus $T_2$. Its
utility is that one can perform the group operation, but with much better
multiplication efficiency. Together, $F_1$, $F_2$, $F_3$, and the isomorphisms
between them constitute our implementation of CEILIDH, which may be depicted as
follows:

$$F_1 \;\underset{\sigma^{-1}}{\overset{\sigma}{\rightleftarrows}}\; F_2 \;\underset{\tau^{-1}}{\overset{\tau}{\rightleftarrows}}\; F_3 \;\underset{\psi}{\overset{\rho}{\rightleftarrows}}\; \mathbb{A}^2(\mathbb{F}_p). \qquad (2)$$

For $F_1$ we give a brief description of the arithmetic, and for $F_2$ and $F_3$
we give full details of all operations. In Lemma 3 we provide a simple cost
analysis where M, A, and I represent the cost of an $\mathbb{F}_p$
multiplication, addition, and inversion respectively. In $\mathbb{F}_p$ we assume
that a subtraction amounts to the same as an addition, and also that squaring
costs the same as a multiplication, since the former operation is seldom used.

For $(n,p) = 1$, let $\zeta_n$ denote a primitive $n$-th root of unity mod $p$,
and as in XTR let $p \equiv 2 \bmod 9$ throughout ($p \equiv 5 \bmod 9$ is
equally valid).

3.1 The Representation $F_1$

A full derivation of the results of this section can be found in [22]. Let
$z = \zeta_9$, so that $\mathbb{F}_{p^6} = \mathbb{F}_p(z)$, and let our basis for
$\mathbb{F}_{p^6}$ be $\{z, z^2, \ldots, z^6\}$. Using a Karatsuba-type method for
multiplication and squaring, these can be performed in 18M + 53A and 12M + 42A
respectively. However, for the cyclotomic subgroup $G_{p,6}$ in which we work,
some improvements can be made. For example, if $g \in G_{p,6}$, inversion is just
the cube of the Frobenius endomorphism, since
$\Phi_6(p) = (p^2 - p + 1) \mid (p^3 + 1)$, and so $g^{-1} = g^{p^3}$. Also the
condition $g^{p^2 - p + 1} = 1$ gives a set of six equations on the six
coefficients of $g$ which enable squaring to be performed with a cost of
6M + 21A.

3.2 The Representation $F_2$

Let $x = \zeta_3$ and $y = \zeta_9 + \zeta_9^{-1}$. Then
$\mathbb{F}_{p^3} = \mathbb{F}_p(y)$, and $\mathbb{F}_{p^6} = \mathbb{F}_{p^3}(x)$.
The bases we use are $\{1, y, y^2 - 2\}$ for $\mathbb{F}_{p^3}$, and $\{1, x\}$ for
the degree two extension. We now describe the basic arithmetic in each of these
extensions.

$\mathbb{F}_{p^3}$ Frobenius:

For our basis, since $p \equiv 2 \bmod 9$, the Frobenius map gives
$y^p = y^2 - 2$, and $(y^2 - 2)^p = -y - (y^2 - 2)$. Hence for
$a = a_0 + a_1 y + a_2(y^2 - 2)$,

$$a^p = a_0 - a_2 y + (a_1 - a_2)(y^2 - 2).$$

$\mathbb{F}_{p^3}$ Multiplication:

Let $a = a_0 + a_1 y + a_2(y^2 - 2)$, $b = b_0 + b_1 y + b_2(y^2 - 2)$. Then

$$ab = (a_0b_0 + 2a_1b_1 + 2a_2b_2 - a_1b_2 - a_2b_1) + (a_0b_1 + a_1b_0 + a_1b_2 + a_2b_1 - a_2b_2)\,y + (a_0b_2 + a_2b_0 + a_1b_1 - a_2b_2)(y^2 - 2).$$

Precompute $t_{00} = a_0b_0$, $t_{11} = a_1b_1$, $t_{22} = a_2b_2$, and
$t_{01} = (a_0 + a_1)(b_0 + b_1)$, $t_{12} = (a_1 - a_2)(b_2 - b_1)$,
$t_{20} = (a_2 - a_0)(b_0 - b_2)$. Then

$$ab = (t_{00} + t_{11} + t_{22} - t_{12}) + (t_{01} + t_{12} - t_{00})\,y + (t_{20} + t_{00} + t_{11})(y^2 - 2).$$
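The saving from nine to six $\mathbb{F}_p$ multiplications can be checked with a short sketch. The Python fragment below is an illustration only, not the implementation described in this paper: the toy prime `P = 11` (chosen so that $p \equiv 2 \bmod 9$) and the function names are assumptions made here. Elements are triples of coefficients on the basis $\{1, y, y^2 - 2\}$.

```python
# Illustrative sketch: multiplication in F_{p^3} on the basis {1, y, y^2 - 2}.
# P = 11 is a toy prime with P % 9 == 2 (an assumption for this example).
P = 11

def mul_schoolbook(a, b, p=P):
    """Product read directly off the expanded formula (nine base products)."""
    a0, a1, a2 = a
    b0, b1, b2 = b
    c0 = a0*b0 + 2*a1*b1 + 2*a2*b2 - a1*b2 - a2*b1
    c1 = a0*b1 + a1*b0 + a1*b2 + a2*b1 - a2*b2
    c2 = a0*b2 + a2*b0 + a1*b1 - a2*b2
    return (c0 % p, c1 % p, c2 % p)

def mul_6m(a, b, p=P):
    """Same product using only the six precomputed products t00, ..., t20."""
    a0, a1, a2 = a
    b0, b1, b2 = b
    t00, t11, t22 = a0*b0, a1*b1, a2*b2
    t01 = (a0 + a1)*(b0 + b1)
    t12 = (a1 - a2)*(b2 - b1)
    t20 = (a2 - a0)*(b0 - b2)
    return ((t00 + t11 + t22 - t12) % p,
            (t01 + t12 - t00) % p,
            (t20 + t00 + t11) % p)
```

For instance, `mul_6m((0, 1, 0), (0, 1, 0))` squares $y$ and returns the coordinates of $2 + (y^2 - 2)$.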

$\mathbb{F}_{p^3}$ Inversion:

Usually, to invert an element in an extension of a prime field one must either use
a basic GCD algorithm, or one of many suggestions based on exponentiating to
a power one less than the group order [7]. However, since the extension degree
is small, we can perform inversion directly: one uses the multiplication formula
and sets the result to the identity, i.e., one solves

$$\begin{pmatrix} a_0 & 2a_1 - a_2 & 2a_2 - a_1 \\ a_1 & a_0 + a_2 & a_1 - a_2 \\ a_2 & a_1 & a_0 - a_2 \end{pmatrix} \begin{pmatrix} b_0 \\ b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$$

for $b$. This gives

$$\begin{pmatrix} b_0 \\ b_1 \\ b_2 \end{pmatrix} = \frac{1}{\Delta} \begin{pmatrix} -a_0^2 + a_1^2 + a_2^2 - a_1a_2 \\ a_2^2 + a_0a_1 - 2a_1a_2 \\ -a_1^2 + a_2^2 + a_0a_2 \end{pmatrix},$$

where
$\Delta = -a_0^3 + a_1^3 + a_2^3 + 3a_0a_1^2 + 3a_0a_2^2 + 3a_1a_2^2 - 6a_1^2a_2 - 3a_0a_1a_2$.
Computing $t_{00} = a_0^2$, $t_{11} = a_1^2$, $t_{22} = a_2^2$, $t_{01} = a_0a_1$,
$t_{12} = a_1a_2$, $t_{20} = a_2a_0$, $t_{012} = a_0 + a_1 + a_2$ and
$t = t_{12}(a_0 + a_1)$, then $\Delta = t_{012}^3 - t_{00}(3t_{012} - a_0) - 9t$. To
finish, we perform one $\mathbb{F}_p$ inversion and obtain

$$a^{-1} = \Delta^{-1}\bigl((t_{11} + t_{22} - t_{00} - t_{12}) + (t_{01} - 2t_{12} + t_{22})\,y + (t_{20} + t_{22} - t_{11})(y^2 - 2)\bigr).$$
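The direct inversion can be exercised in the same toy setting. The sketch below is a hedged illustration under the same assumptions as before (toy prime `P = 11`, hypothetical function names); it evaluates $\Delta$ through $t_{012}$ and $t$ as described and performs a single inversion modulo $p$.

```python
# Illustrative sketch: direct inversion in F_{p^3} on the basis {1, y, y^2 - 2}.
# P = 11 is a toy prime with P % 9 == 2 (an assumption for this example).
P = 11

def mul(a, b, p=P):
    """Six-multiplication product, as in the multiplication formula."""
    a0, a1, a2 = a
    b0, b1, b2 = b
    t00, t11, t22 = a0*b0, a1*b1, a2*b2
    t01 = (a0 + a1)*(b0 + b1)
    t12 = (a1 - a2)*(b2 - b1)
    t20 = (a2 - a0)*(b0 - b2)
    return ((t00 + t11 + t22 - t12) % p,
            (t01 + t12 - t00) % p,
            (t20 + t00 + t11) % p)

def inv(a, p=P):
    """Inverse of a nonzero element using one inversion in F_p."""
    a0, a1, a2 = a
    t00, t11, t22 = a0*a0, a1*a1, a2*a2
    t01, t12, t20 = a0*a1, a1*a2, a2*a0
    t012 = a0 + a1 + a2
    t = t12*(a0 + a1)
    delta = (t012**3 - t00*(3*t012 - a0) - 9*t) % p
    d_inv = pow(delta, -1, p)          # modular inverse, Python 3.8+
    return (d_inv*(t11 + t22 - t00 - t12) % p,
            d_inv*(t01 - 2*t12 + t22) % p,
            d_inv*(t20 + t22 - t11) % p)
```

Multiplying any nonzero triple by its `inv` should return `(1, 0, 0)`, which gives a quick consistency check between the two formulas.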
$\mathbb{F}_{p^6}$ Frobenius:

Let $c = c_0 + c_1 x$. Then $c^p = (c_0 + c_1 x)^p = (c_0^p - c_1^p) - c_1^p x$.

$\mathbb{F}_{p^6}$ Multiplication:

For $c = c_0 + c_1 x$, $d = d_0 + d_1 x$, we have
$cd = (c_0d_0 - c_1d_1) + (c_0d_1 + c_1d_0 - c_1d_1)x$. If we compute
$t_{00} = c_0d_0$, $t_{11} = c_1d_1$, and $t_{01} = (c_0 + c_1)(d_0 + d_1)$, then
$cd = (t_{00} - t_{11}) + (t_{01} - t_{00} - 2t_{11})x$.

$\mathbb{F}_{p^6}$ Squaring:

$c^2 = (c_0 + c_1 x)^2 = (c_0^2 - c_1^2) + c_1(2c_0 - c_1)x$. We compute
$t_{01} = (c_0 + c_1)(c_0 - c_1)$, giving $c^2 = t_{01} + c_1(2c_0 - c_1)x$.

$\mathbb{F}_{p^6}$ Inversion:

Performing again a direct inversion as in $\mathbb{F}_{p^3}$, we find

$$d_0 + d_1 x = (c_0 + c_1 x)^{-1} = \frac{(c_0 - c_1) - c_1 x}{c_0^2 - c_0c_1 + c_1^2},$$

so that we still only require one $\mathbb{F}_p$ inversion. Precomputing
$t_0 = (c_1 - c_0)$, $t_{01} = c_0c_1$, and $\Delta = t_0^2 + t_{01}$, the
coefficients of the inverse $d$ are given by

$$d_0 = -\Delta^{-1} t_0, \qquad d_1 = -\Delta^{-1} c_1.$$

If we are working in $G_{p,6}$, then as in $F_1$ inversions are essentially free
thanks to the cheap Frobenius endomorphism.
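The quadratic-extension formulas can be sketched in a few lines. To keep the example short, the fragment below takes the coefficients $c_0, c_1$ in $\mathbb{F}_p$ rather than $\mathbb{F}_{p^3}$; this simplification, the toy prime `P = 11` and the function names are assumptions of the sketch. The formulas themselves only use $x^2 = -1 - x$ and so carry over unchanged to $\mathbb{F}_{p^3}$ coefficients.

```python
# Illustrative sketch: arithmetic in F_p[x]/(x^2 + x + 1), standing in for the
# extension F_{p^3}(x). P = 11 is a toy prime with P % 3 == 2 (assumption), so
# x^2 + x + 1 is irreducible modulo P.
P = 11

def mul(c, d, p=P):
    """Three base multiplications: t00, t11, t01."""
    c0, c1 = c
    d0, d1 = d
    t00, t11 = c0*d0, c1*d1
    t01 = (c0 + c1)*(d0 + d1)
    return ((t00 - t11) % p, (t01 - t00 - 2*t11) % p)

def sqr(c, p=P):
    """Two base multiplications."""
    c0, c1 = c
    t01 = (c0 + c1)*(c0 - c1)
    return (t01 % p, c1*(2*c0 - c1) % p)

def inv(c, p=P):
    """Inverse via Delta = t0^2 + t01, which equals the norm of c."""
    c0, c1 = c
    t0 = c1 - c0
    t01 = c0*c1
    d_inv = pow((t0*t0 + t01) % p, -1, p)   # modular inverse, Python 3.8+
    return (-d_inv*t0 % p, -d_inv*c1 % p)

def frobenius(c, p=P):
    """c^p; with F_p coefficients c_i^p = c_i, so c^p = (c0 - c1) - c1*x."""
    c0, c1 = c
    return ((c0 - c1) % p, -c1 % p)
```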



$\sigma : F_1 \to F_2$

In addition to the individual arithmetic of $F_1$ and $F_2$ we need to specify an
efficiently computable isomorphism between them. Writing $x$ and $y$ in terms of
$z$ we find that $x = z^3$, and $y = z - z^2 - z^5$, and so
$\sigma^{-1} : F_2 \to F_1$ can be evaluated with just a few additions:

$$\sigma^{-1} = \begin{pmatrix} 0 & 1 & -1 & 0 & 0 & 1 \\ 0 & -1 & 1 & 0 & 1 & 0 \\ -1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 & 0 & 1 \\ -1 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$

Since $\sigma^{-1}$ has determinant three, a naive evaluation of $\sigma$
necessitates four divisions by three. It is not possible to eliminate all of
these since for our $F_1$, writing $\mathbb{F}_{p^6}$ as a quadratic extension of
a cubic extension, all bases have determinant divisible by three. We can reduce
this to just one division by three however (or a multiplication by its
precomputed inverse), by writing



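$$\sigma = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & -1 & 1 \\ 0 & 0 & 1 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & -1 \\ 1 & 1 & 0 & -1 & -1 & 0 \\ 0 & 0 & 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & -1 \\ 1 & 1 & 0 & 0 & 0 & 0 \\ \tfrac{2}{3} & \tfrac{1}{3} & 0 & -\tfrac{1}{3} & \tfrac{1}{3} & 0 \end{pmatrix}.$$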
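A small sketch makes the single division by three concrete. The Python fragment below is an illustration under stated assumptions: the toy prime `P = 11`, the function names, and the particular order of the linear operations are choices made here, obtained by solving the linear system defined by $\sigma^{-1}$ rather than by transcribing any implementation. Vectors hold coordinates on $\{z, \ldots, z^6\}$ for $F_1$ and on $\{1, y, y^2 - 2, x, xy, x(y^2 - 2)\}$ for $F_2$.

```python
# Illustrative sketch: the change of basis between F_1 and F_2 modulo a toy
# prime. P = 11 and all names are assumptions made for this example.
P = 11
INV3 = pow(3, -1, P)   # precomputed inverse of three (Python 3.8+)

def sigma_inv(v):
    """F_2 -> F_1: apply the matrix sigma^{-1}; additions only."""
    v1, v2, v3, v4, v5, v6 = v
    return ((v2 - v3 + v6) % P, (-v2 + v3 + v5) % P, (-v1 + v4) % P,
            (-v3 + v5) % P, (-v2 + v6) % P, (-v1) % P)

def sigma(w):
    """F_1 -> F_2: one multiplication by 1/3, everything else additions."""
    w1, w2, w3, w4, w5, w6 = w
    v2 = (2*w1 + w2 - w4 - 2*w5) * INV3 % P   # the only division by three
    v3 = (2*v2 + w5 - w1) % P
    v5 = (w4 + v3) % P
    v6 = (w5 + v2) % P
    return ((-w6) % P, v2, v3, (w3 - w6) % P, v5, v6)
```

Applying `sigma` after `sigma_inv` (or the reverse) returns the original coordinate vector.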
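3.3 The Representation $F_3$

In our notation, the group operation in CEILIDH as originally described is
performed in $F_2$ [19], and the inverse birational maps
$\psi : \mathbb{A}^2(\mathbb{F}_p) \setminus V(f) \to T_6(\mathbb{F}_p) \setminus \{1, x^2\}$
and
$\rho : T_6(\mathbb{F}_p) \setminus \{1, x^2\} \to \mathbb{A}^2(\mathbb{F}_p) \setminus V(f)$
are given by

$$\psi(\alpha_1, \alpha_2) = \frac{1 + \alpha_1 y + \alpha_2(y^2 - 2) + (1 - \alpha_1^2 - \alpha_2^2 + \alpha_1\alpha_2)\,x}{1 + \alpha_1 y + \alpha_2(y^2 - 2) + (1 - \alpha_1^2 - \alpha_2^2 + \alpha_1\alpha_2)\,x^2}, \qquad (3)$$

where $V(f)$ is the set of zeros of
$f(\alpha_1, \alpha_2) = 1 - \alpha_1^2 - \alpha_2^2 + \alpha_1\alpha_2 = 0$ in
$\mathbb{A}^2(\mathbb{F}_p)$; and for $\beta = \beta_1 + \beta_2 x$, with
$\beta_1, \beta_2 \in \mathbb{F}_{p^3}$, let
$(1 + \beta_1)/\beta_2 = u_1 + u_2 y + u_3(y^2 - 2)$. Then
$\rho(\beta) = (u_2/u_1, u_3/u_1)$.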

Since the torus $T_6$ is two-dimensional, given compressed points
$P_1 = (\alpha_1, \alpha_2)$ and $P_2 = (\beta_1, \beta_2)$, it would be
aesthetically appealing to compute their composition directly without having to
map the affine representation back to $\mathbb{F}_{p^6}$, i.e., to find the
$(\gamma_1, \gamma_2) \in \mathbb{A}^2$ such that
$(\alpha_1, \alpha_2) \circ (\beta_1, \beta_2) = (\gamma_1, \gamma_2)$, where
$\circ$ refers to an operation equivalent to multiplication in
$\mathbb{F}_{p^6}$. The drawback with this approach is that the group law on
$T_6$ as we have presented it is defined only in terms of the arithmetic of
$\mathbb{F}_{p^6}$, and so the decompression of $P_1$ and $P_2$ before
multiplication seems essential.

If one insists on representing intermediate results in their compressed form,
in an exponentiation for example, we found that this costs 24M + 43A + I
for a multiplication, and 21M + 38A + I for a squaring. Note that these are
even more costly than a general elliptic curve addition or doubling, for example.
The reason these operations are so expensive is that the rational representation
of $T_6$ does not lend itself favourably to performing the group law operation:
this statement does not hold for the XTR representation, which is why the
corresponding arithmetic is so much faster (cf. Figure 1).

The arithmetic developed in the previous section shows that if one alternatively
performs a one-time decompression at the start of, and a re-compression
at the end of, an exponentiation, then the cost of a general multiplication and
squaring is only 18M + 54A and 12M + 33A respectively in $F_2$, and 18M + 53A
and 6M + 21A respectively in $F_1$. Clearly decompressing to $\mathbb{F}_{p^6}$
seems to be the better method.

This is not the whole story though. Since $T_6(\mathbb{F}_p)$ is by definition
the intersection of the kernels of the two norm maps
$N_{\mathbb{F}_{p^6}/\mathbb{F}_{p^3}}$ and $N_{\mathbb{F}_{p^6}/\mathbb{F}_{p^2}}$,
if we have a good representation for either of these, we can save some work. The
following lemma emphasises the relevance to CEILIDH of the parametrisation of
elements in the kernel of the former norm map, which is implicit in the
construction of the map $\psi$ in (3).



Lemma 2. There is an isomorphism

$$\tau : \mathrm{Ker}[N_{\mathbb{F}_{p^6}/\mathbb{F}_{p^3}}] \to \left\{ \frac{b + x}{b + x^2} \;\middle|\; b \in \mathbb{F}_{p^3} \right\} \cup \{1\},$$

where for
$a = a_0 + a_1 x \in \mathrm{Ker}[N_{\mathbb{F}_{p^6}/\mathbb{F}_{p^3}}] \setminus \{1\}$,

$$\tau(a) = \frac{b + x}{b + x^2},$$

with $b = (1 + a_0)/a_1$ if $a_1 \neq 0$ and $b = a_0/(a_0 - 1)$ otherwise, and

$$\tau^{-1}(b) = \frac{b^2 - 1}{b^2 - b + 1} + \frac{2b - 1}{b^2 - b + 1}\,x.$$

Proof. For $a \in \mathbb{F}_{p^6}$,
$N_{\mathbb{F}_{p^6}/\mathbb{F}_{p^3}}(a) = a^{p^3 + 1}$, and so the kernel of this
map has at most $1 + p^3$ solutions. For each $(b + x)/(b + x^2)$,
$b \in \mathbb{F}_{p^3}$, the stated norm is one, and counting also the identity,
we have all $1 + p^3$ solutions. Solving $a_0 + a_1 x = (b + x)/(b + x^2)$ for $b$
gives the second part, while for the third, we solve the same equation for
$a_0, a_1$, where we have used the condition that
$a_0^2 + a_1^2 - a_0a_1 = 1 \iff a_0 + a_1 x \in \mathrm{Ker}[N_{\mathbb{F}_{p^6}/\mathbb{F}_{p^3}}]$.

□
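The two directions of the isomorphism in Lemma 2 can be tried out numerically. The sketch below is an illustration only: it replaces the coefficient field $\mathbb{F}_{p^3}$ by $\mathbb{F}_p$ (so that it parametrises the norm-one elements of $\mathbb{F}_{p^2}$ instead), and the toy prime `P = 11` and function names are assumptions. The formulas are those of the lemma.

```python
# Illustrative sketch of Lemma 2 with F_p standing in for F_{p^3}.
# P = 11 is a toy prime with P % 3 == 2 (assumption); x satisfies x^2 + x + 1 = 0.
P = 11

def norm(a, p=P):
    """Norm of a0 + a1*x down to the coefficient field."""
    a0, a1 = a
    return (a0*a0 - a0*a1 + a1*a1) % p

def tau_inv(b, p=P):
    """b -> (a0, a1) with a0 + a1*x = (b + x)/(b + x^2)."""
    n_inv = pow((b*b - b + 1) % p, -1, p)    # modular inverse, Python 3.8+
    return ((b*b - 1) * n_inv % p, (2*b - 1) * n_inv % p)

def tau(a, p=P):
    """Norm-one element a0 + a1*x, different from 1 -> parameter b."""
    a0, a1 = a
    if a1 % p != 0:
        return (1 + a0) * pow(a1, -1, p) % p
    return a0 * pow(a0 - 1, -1, p) % p
```

In this toy setting the `P` values of `tau_inv(b)` together with the identity `(1, 0)` give `P + 1` distinct norm-one elements, mirroring the counting argument in the proof.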

With this we can introduce the following:

Definition 2. $F_3$ is the set of elements

$$\left\{ \frac{a_0 + a_1 x}{a_0 + a_1 x^2},\; a_i \in \mathbb{F}_{p^3} \right\}.$$

When the coefficient $a_1$ of this representation equals 1, we say the element is
reduced.


Note that if we do not need the reduced form of an element in $F_3$, then
evaluating $\tau$ simplifies to $(1 + a_0 + a_1 x)/(1 + a_0 + a_1 x^2)$, so no
inversion is necessary. Mapping an unreduced element back to $F_2$, we obtain

$$\tau^{-1}\left(\frac{a_0 + a_1 x}{a_0 + a_1 x^2}\right) = \frac{a_0^2 - a_1^2}{a_0^2 - a_0a_1 + a_1^2} + \frac{2a_0a_1 - a_1^2}{a_0^2 - a_0a_1 + a_1^2}\,x.$$

As we have pointed out, this construction is no more than the rational
parametrisation of $T_2(\mathbb{F}_{p^3})$. As a result of the symmetry of $x$ and
$x^2$ in the irreducible polynomial defining the quadratic extension of $F_2$,
for all arithmetic operations between elements of $F_3$, the coefficients of the
numerator and those of the denominator correspond exactly. Hence we need only
work with the numerator, which will in general have the form
$a = a_1 + a_2 x$. For expositional purposes, in the following we still write
elements of $T_6$ as fractions but for our implementation this is of course
unnecessary.

Using this fractional form alone does not seem to offer any advantage over
the $F_2$ representation. However, considering the exponentiation of a reduced
element $(g + x)/(g + x^2)$ of $T_6$, we already save one $\mathbb{F}_{p^3}$
multiplication for every multiplication by this element, since

$$\left(\frac{g + x}{g + x^2}\right)\left(\frac{a_0 + a_1 x}{a_0 + a_1 x^2}\right) = \frac{(ga_0 - a_1) + (ga_1 + a_0 - a_1)\,x}{(ga_0 - a_1) + (ga_1 + a_0 - a_1)\,x^2}, \qquad (4)$$

so we only need to compute $ga_0$ and $ga_1$, and a few additions, when our
multiplier is in this form.

Furthermore, we can exploit the fractional form of elements of $F_3$ for squaring
and inversion, and if additions are sufficiently cheap compared to multiplications,
we can actually reduce the cost of a basic multiplication to the theoretical
minimum measured in the number of $\mathbb{F}_p$ multiplications, using a
Toom-Cook-style interpolation [9]. Unfortunately we do not have space to detail
this last point here.

The reason this all works is that the rational representation of elements of
$T_2$ can be embedded efficiently as a fraction in the field extension. Noting
that we need only work with the numerator, the group law can be performed
directly on this compressed element.

The reduced form of an element may be viewed as the affine representation of
$T_2$, with the non-compressible identity element being the point at infinity,
and the non-reduced form of an element as the projective representation, with
identity $\lambda$ for any $\lambda \in \mathbb{F}_{p^3}^*$. This point was
essentially made in [19], but without reference to its applicability to CEILIDH
as well. There however, it was suggested the group law be performed entirely in
$\mathbb{F}_{p^3}$, which would require an $\mathbb{F}_{p^3}$ inversion for every
multiplication: this would be very inefficient.

In fact the idea of using the torus $T_2$, as applied to cryptography, is not new.
It has been suggested by other authors that elements of $T_2(\mathbb{F}_p)$ be
represented in the (group-theoretically) isomorphic set
$\mathbb{F}_{p^2}^*/\mathbb{F}_p^*$. If $x = \zeta_3$, then elements are
represented as $a_0 + a_1 x$, and $a_0 + a_1 x = b_0 + b_1 x$ iff
$a_0/a_1 = b_0/b_1$. As a consequence, elements for which $a_1 \neq 0$ can be
represented by $a_0/a_1$, and all the arithmetic above follows mutatis mutandis.

$F_3$ Frobenius:

Let $a \in F_3$. Then

$$\left(\frac{a_0 + a_1 x}{a_0 + a_1 x^2}\right)^p = \frac{a_0^p + a_1^p x^2}{a_0^p + a_1^p x} = \frac{(a_0^p - a_1^p) - a_1^p x}{(a_0^p - a_1^p) - a_1^p x^2} = \frac{(a_1^p - a_0^p) + a_1^p x}{(a_1^p - a_0^p) + a_1^p x^2}.$$

$F_3$ Multiplication:

Multiplication by a reduced element is performed as in (4), or if by a
non-reduced element, exactly as in $F_2$.

$F_3$ Squaring:

This is performed as in $F_2$.

$F_3$ Inversion:

This is straightforward, since elements are represented as fractions:

$$\left(\frac{a_0 + a_1 x}{a_0 + a_1 x^2}\right)^{-1} = \frac{a_0 + a_1 x^2}{a_0 + a_1 x} = \frac{(a_1 - a_0) + a_1 x}{(a_1 - a_0) + a_1 x^2}.$$

Also, since we use the intermediate representation $F_3$ between
$\mathbb{A}^2(\mathbb{F}_p)$ and $F_2$, we must adjust the map
$\rho : F_3 \setminus \{1, x^2\} \to \mathbb{A}^2(\mathbb{F}_p) \setminus V(f)$. Let
$\beta = (\beta_1 + \beta_2 x)/(\beta_1 + \beta_2 x^2) \in F_3$, with
$\beta_1/\beta_2 = u_1 + u_2 y + u_3(y^2 - 2)$: then
$\rho(\beta) = (u_2/u_1, u_3/u_1)$. We summarise the results regarding arithmetic
in $F_1$, $F_2$ and $F_3$ in the following:

Lemma 3. The costs of arithmetical operations in $F_1$, $F_2$ and $F_3$ are:

Operation     F_1          F_2          F_3
Multiply      18M + 53A    18M + 54A    18M + 54A
Square        6M + 21A     12M + 33A    12M + 33A
Inverse       2A           6A           3A
Frobenius     1A           10A          10A
Reduce        n/a          n/a          19M + 35A + I
Mixed Mul.    n/a          n/a          12M + 33A

Map                                        Cost
$\sigma : F_1 \to F_2$                     1M + 11A
$\tau : F_2 \to F_3$                       1A
$\rho : F_3 \to \mathbb{A}^2$              14M + 19A + I
$\psi : \mathbb{A}^2 \to F_3$              2M + 3A
$\tau^{-1} : F_3 \to F_2$                  25M + 41A + I
$\sigma^{-1} : F_2 \to F_1$                8A

Here the operation Reduce refers to obtaining the reduced form of an element of
$F_3$, and a Mixed Mul. refers to multiplying a non-reduced element by a reduced
one. The cost of the map $\rho : F_3 \to \mathbb{A}^2$ assumes the element being
compressed is in non-reduced form, as this is the case after an exponentiation
in either $F_1$ or $F_3$. Also, for the map $\tau^{-1} : F_3 \to F_2$ we assume
that the $x$-coefficient is in $\mathbb{F}_p$ only, as in (3), and not in
$\mathbb{F}_{p^3}$, as in practice one would only perform this operation when
decompressing from $\mathbb{A}^2$ to $F_1$, and not from a non-reduced element.

3.4 Exponentiation

For $F_1$, $F_2$ and $F_3$ one can use the Frobenius map to obtain fast
exponentiation. In a subgroup of order $l$ where $l \mid (p^2 - p + 1)$, we write
an exponent $m$ as $m = m_1 + m_2 p \bmod l$, where $m_1, m_2$ are approximately
half the bitlength of $m$ [22]. One can find $m_1$ and $m_2$ very quickly having
performed a one-time Gaussian two dimensional lattice basis reduction, and using
this basis to find the closest vector to $(m, 0)^T$. To compute $a^m$ for a
random $a$, we perform a double exponentiation using the Joint Sparse Form (JSF)
of the integers $m_1, m_2$ [21] and Shamir's trick, originally due to Straus
[24], which on average halves the number of pairs of non-zero bits in their
paired binary expansion. The use of the JSF is possible since we have virtually
free inversion.
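The exponent splitting can be sketched as follows. This is an illustration under stated assumptions, not the routine used for the timings: the function names, the use of Lagrange-Gauss reduction followed by coordinate rounding for the closest-vector step, and the toy parameters $p = 29$, $l = 271$ (with $l \mid p^2 - p + 1 = 813$) are choices made here.

```python
# Illustrative sketch: write m = m1 + m2*p (mod l) with short m1, m2.
# The lattice is {(a, b) : a + b*p = 0 (mod l)}, spanned by (l, 0) and (-p, 1).

def round_div(a, b):
    """Nearest integer to a/b for b > 0 (ties rounded up)."""
    return (2*a + b) // (2*b)

def gauss_reduce(u, v):
    """Lagrange-Gauss reduction of a basis (u, v) of a rank-2 integer lattice."""
    n2 = lambda w: w[0]*w[0] + w[1]*w[1]
    if n2(u) < n2(v):
        u, v = v, u
    while True:
        k = round_div(u[0]*v[0] + u[1]*v[1], n2(v))
        u = (u[0] - k*v[0], u[1] - k*v[1])
        if n2(u) >= n2(v):
            return v, u          # shorter vector first
        u, v = v, u

def split_exponent(m, p, l):
    """Return (m1, m2) with m1 + m2*p = m (mod l), found by rounding in the
    reduced basis (an approximate closest-vector step)."""
    b1, b2 = gauss_reduce((l, 0), (-(p % l), 1))
    det = b1[0]*b2[1] - b1[1]*b2[0]
    if det < 0:                  # orient the basis so that det > 0
        b1, b2 = b2, b1
        det = -det
    x = round_div(m*b2[1], det)  # (m, 0) = x*b1 + y*b2 over the rationals
    y = round_div(-m*b1[1], det)
    return (m - x*b1[0] - y*b2[0], -x*b1[1] - y*b2[1])
```

The basis reduction depends only on $p$ and $l$, so in practice it would be done once and reused for every exponent; the sketch recomputes it on each call for brevity. The returned $m_1, m_2$ may be negative, which is harmless when inversion is virtually free.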

When the base of the exponentiation is fixed we also use the JSF and Shamir's
trick but perform some precomputation as well. Fixed elements are important
since we can spend some time and space to precompute values that might
accelerate an operation yet be reused often enough to make the cost of doing so
acceptable. We store the required powers of the fixed base, indexed by
$i, j \in \{0, \pm 1\}$ and by $k$ from 1 to half the bit-length of $l$, where
$l$ is the size of the subgroup we work with. For a 1024-bit field, and with $l$
approximately of length 160 bits, for the price of storing 4 x 80 = 320 field
elements, we eliminate all squarings from the exponentiation routine. In $F_3$
the storage of 320 reduced elements requires about 22.5Kb, whereas for $F_1$
and $F_2$ this amount is doubled to 45Kb, since elements cannot be reduced. In
XTR only one element is precomputed (it is unclear how to exploit more
precomputation). This provides nearly as good a speed up as for CEILIDH, but
here the cost of the precomputation is much cheaper in both time and space.
For cryptosystems where the group law is just ordinary multiplication in a finite
field though, one can always exchange space for time, so CEILIDH has a slight
advantage here over XTR, if these resources are available. We concede that more
efficient precomputation methods can be applied given our chosen level of
storage [14], but are confident our method provides an accurate reflection of the
possible gains resulting from this approach.

For double exponentiation we assume that both bases are random, so that no 
precomputation can be employed. Using the JSF for each exponent, we combined 
the squarings for both exponentiations while performing the multiplications sep- 
arately. It is possible to make this slightly more efficient [17], and to use some 
precomputation, but again we are satisfied that our results are indicative of the 
general performance of the algorithms. 
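The idea of sharing squarings between two exponentiations can be illustrated with the plain binary form of Shamir's trick. The sketch below is a simplification made here for illustration: it works in the multiplicative group modulo a prime rather than in $T_6(\mathbb{F}_p)$, uses ordinary binary expansions instead of the JSF, assumes non-negative exponents, and its function name is hypothetical.

```python
# Illustrative sketch: simultaneous exponentiation a^m1 * b^m2 mod p with a
# single shared chain of squarings (binary Shamir/Straus trick, not the JSF).

def double_exp(a, m1, b, m2, p):
    """Compute a^m1 * b^m2 mod p for non-negative exponents m1, m2."""
    table = {(1, 0): a % p, (0, 1): b % p, (1, 1): a * b % p}
    r = 1
    for i in reversed(range(max(m1.bit_length(), m2.bit_length()))):
        r = r * r % p                      # one squaring per bit position
        bits = ((m1 >> i) & 1, (m2 >> i) & 1)
        if bits != (0, 0):                 # at most one multiplication per bit
            r = r * table[bits] % p
    return r
```

With signed digits, as in the JSF, the table would also hold inverses of the entries, which is where cheap inversion becomes important.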

4 Implementation Results 

To demonstrate the different performance characteristics of the three representa- 
tions of CEILIDH, we constructed an implementation of the entire system, based 
on the algorithms described in the previous section. We also implemented the 
fastest algorithms for the equivalent XTR protocols [23] , so that our comparison 
was made between the best possible implementations of both systems. 

We based this implementation on a special purpose library for arithmetic 
in Fp that represents and manipulates field elements using Montgomery re- 
duction [16]. Montgomery arithmetic facilitates fast field operations given some 
modulus specific precomputation [3]. This is ideal for our purposes since after 
key-generation the field Fp remains the same for all subsequent operations. 

We used a GCC 3.3 compiler suite to build our implementation and ran
timing experiments on a Linux based PC incorporating a 2.80 GHz Intel
Pentium 4 processor. The entire system was constructed in C++ except for small
assembly language fragments to accelerate operations in $\mathbb{F}_p$. We accept
that further performance improvements could be made through aggressive profiling
and optimisation but are confident our results are representative of the
underlying algorithms and allow a comparison between them.

For our experiments we randomly chose 500 key pairs $(p, l)$ with the field
characteristic $p$ of length 176 bits and subgroup size $l$ of 160 bits. These
parameters heuristically provide the equivalent of 1024 bit RSA security. With
the same key pairs for both CEILIDH and XTR we performed 500 instances of each
operation listed in Figure 1. For exponentiations, exponents were chosen randomly
modulo $l$ in all cases.

The left table shows timings for operations pertinent to use in real
cryptosystems. We use $x_r$ to denote a random element in a given representation
and $x_F$ to represent a fixed element. Thus, $x_r^m$ represents a single
exponentiation with a random base and exponent while $x_r^m \cdot y_r^n$
represents a double exponentiation without precomputation.

These timings are useful since they form the basis of all public key
cryptographic protocols, and so we have a good idea of the comparative
performance of the different representations of $T_6(\mathbb{F}_p)$, and XTR. One
can see that the results are in general agreement with our arithmetic cost
analysis. Indeed $F_1$ provides the most efficient representation when
exponentiating a random element, and with precomputation, $F_3$ offers a slight
improvement over XTR, allowing for the cost of $\rho : F_3 \to \mathbb{A}^2$ as
well.

Operation            F_1           F_2         F_3         XTR
x_r · y_r            [illegible]   41.2 μs     41.8 μs     8.7 μs
x_r^2                16.8 μs       26.9 μs     27.7 μs     4.9 μs
Reduce               n/a           n/a         170.6 μs    n/a
Mixed Mul.           n/a           n/a         28.4 μs     n/a
x_r^m                2.99 ms       3.94 ms     3.94 ms     2.57 ms
x_r^m · y_r^n        4.71 ms       5.75 ms     5.79 ms     2.98 ms
x_F^m                1.56 ms       1.74 ms     1.21 ms     1.49 ms
Precomp.             5.89 ms       9.33 ms     63.65 ms    1.43 ms

Mapping          Time         Mapping          Time
F_1 → F_2        7.1 μs       F_1 ← F_2        2.4 μs
F_2 → F_3        1.9 μs       F_2 ← F_3        222.8 μs
F_3 → A^2        161.7 μs     F_3 ← A^2        5.1 μs

Fig. 1. Timing Results

The right table in Figure 1 demonstrates the cost of applying mapping
operations on elements to transform them between our different representations.
These results represent the time taken to map a random element in the source
representation and transform it into the corresponding element in the target
representation. One notes from these that it is unfortunately not advantageous
to 'mix' representations, so that in a single exponentiation squaring is performed
in $F_1$, and multiplication in $F_3$. This would be analogous to the use of mixed
coordinate systems in elliptic curve cryptography [4].

The authors note that it is possible to negate the necessity of $F_1$ altogether
by basing all arithmetic in a field of even characteristic, since then squarings
are just as inexpensive in $F_3$. With suitable keys, this would simplify
implementations of CEILIDH and far more efficient precomputation strategies could
be brought to bear.
Abstract. We introduce a new model for elliptic curves over rings of
odd characteristic, and study its properties and its utility in numerical
computations. These models turn out to be particularly interesting for elliptic
curves with complex multiplication, for which they provide very simple stable
equations. The invariants associated to these models allow an easy construction
of ring class fields of certain imaginary quadratic orders, with
interesting theoretical consequences and practical utility in numerical
computations.

1 Introduction 

We introduce a new model for elliptic curves over rings of odd characteristic.
The symmetry of this model gives it good reduction properties: the resulting
curves are stable at any odd prime. As a consequence, the discriminants of these
models are quite small, since they are only divisible by a relatively low power
of 2. The aim of this paper is the study of these models, both from the
theoretical and the practical viewpoint.

We show the reduction properties of our model in section 2, which ends 
up with a table of elliptic curves with complex multiplication by imaginary 
quadratic orders with class number 1 and 2 and good reduction outside 2. The 
invariants related to our model can be seen as certain modular functions, which 
are described in section 3. In section 4 we describe an easy construction of certain 
ring class fields based on the use of the discriminant of these models. The class 
polynomials originated by these discriminants are then analyzed in section 5. 
Finally, in section 6 we discuss the possibility of using the new invariants for the 
computation of Hilbert class polynomials. 
2 Symmetric Models of Elliptic Curves

Let $A$ be an integral domain, with field of fractions $K$ of odd characteristic.
For $G \in A$ we consider the elliptic curve

$$E_G : Y^2 = X^3 + GX^2 + X. \qquad (1)$$

We shall call such an equation for an elliptic curve a symmetric model; the word
symmetry refers both to the coefficients of the polynomial $X^3 + GX^2 + X$ and
to the role of its non-zero roots $\xi_1, \xi_2$, which satisfy
$\xi_1\xi_2 = 1$.
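The discriminant and the j-invariant of $E_G$ are:

$$\mathfrak{D} = 16(G^2 - 4), \qquad j = \frac{256(G^2 - 3)^3}{G^2 - 4} = \frac{(\mathfrak{D} + 16)^3}{\mathfrak{D}}. \qquad (2)$$

The first expression of $j$ in terms of $G$ shows that $G$ is integral over
$A[\tfrac{1}{2}][j]$, since it satisfies the sextic equation

$$G^6 - 9G^4 - \left(\frac{j}{256} - 27\right)G^2 + \frac{j}{64} - 27 = 0. \qquad (3)$$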
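Formulas (2) and (3) can be checked with exact rational arithmetic. The sketch below is an illustration: the function name is hypothetical, and the discriminant and $j$-invariant are computed from the standard $b$- and $c$-invariants of a Weierstrass equation $Y^2 = X^3 + a_2X^2 + a_4X + a_6$, with $a_2 = G$, $a_4 = 1$, $a_6 = 0$.

```python
# Illustrative sketch: discriminant and j-invariant of Y^2 = X^3 + G*X^2 + X,
# computed from the standard Weierstrass b- and c-invariants (a1 = a3 = 0).
from fractions import Fraction

def invariants(G):
    """Return (discriminant, j) of the symmetric model E_G, for G^2 != 4."""
    a2, a4, a6 = Fraction(G), Fraction(1), Fraction(0)
    b2, b4, b6 = 4*a2, 2*a4, 4*a6
    b8 = 4*a2*a6 - a4*a4
    disc = -b2*b2*b8 - 8*b4**3 - 27*b6*b6 + 9*b2*b4*b6
    c4 = b2*b2 - 24*b4
    return disc, c4**3 / disc
```

For example, `invariants(6)` returns discriminant $512 = 2^9$ and $j = 287496 = 2^3 3^3 11^3$, the $j$-invariant of the curve with complex multiplication by the order of discriminant $-16$.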

Moreover, this expression shows that every elliptic curve $E$ over a ring $A$ of
odd characteristic admits a symmetric model $E_G$ over an algebraic extension of
$A$: it can be obtained from the j-invariant of the curve, solving equation (3).
In general we will find six symmetric models $E_G$ not defined over $A$, but over
a triquadratic extension of $A$. If one needs a model defined over a smaller
extension of $A$, the following quadratic twist of $E_G$ can be used:

$$E'_G : Y^2 = X^3 + G^2X^2 + G^2X = X^3 + \frac{\mathfrak{D} + 64}{16}X^2 + \frac{\mathfrak{D} + 64}{16}X.$$

The discriminant of this model is
$\mathfrak{D}' = 256G^6(G^2 - 4) = \mathfrak{D}(\mathfrak{D} + 64)^3/256$.
The expression of $j$ in terms of $\mathfrak{D}$ provides the relation:

$$\mathfrak{D}^3 + 48\mathfrak{D}^2 + (768 - j)\mathfrak{D} + 2^{12} = 0, \qquad (4)$$

which shows that $\mathfrak{D}$ is a unit in $A[\tfrac{1}{2}][j]$, and suggests
that the curve $E_G$ should have very nice reduction properties outside 2. The
following result confirms this point:



Proposition 1.

a) $E_G$ is a stable curve over $A[\tfrac{1}{2}][G]$.

b) The odd factor of the conductor of $E_G$ is
$\prod_{\mathfrak{p} \mid G \pm 2} \mathfrak{p}$.

c) For an odd prime $\mathfrak{p} \mid G + 2$, the curve $E_G$ has split
multiplicative reduction.

d) For an odd prime $\mathfrak{p} \mid G - 2$ the curve $E_G$ has multiplicative
reduction. The reduction is split if $-1 \in (\mathbb{F}_{\mathfrak{p}}^*)^2$ and
non-split otherwise.
Proof. For an odd prime $\mathfrak{p}$ dividing $G \pm 2$ the reduction of $E_G$
mod $\mathfrak{p}$ is

$$Y^2 = X^3 \mp 2X^2 + X = X(X \mp 1)^2,$$

which has an evident node. The tangent lines of this node are the linear factors
of $Y^2 \mp (X \mp 1)^2$. ■

In fact, we can say more about the reduction properties of the symmetric
models.

Theorem 1. Let $E$ be an elliptic curve over $A$. Let $L$ be the field of
fractions of $A[G]$ and $\mathcal{O}_L$ its integer ring.

a) If an odd prime $\mathfrak{p} \in \mathrm{Spec}\,\mathcal{O}_L$ satisfies
$v_{\mathfrak{p}}(G) < 0$ (or equivalently $v_{\mathfrak{p}}(\mathfrak{D}) < 0$),
then $E$ has potentially multiplicative reduction at $\mathfrak{p}$.

b) If $j(E) \in A$, the symmetric models $E_G$ are integral minimal models of $E$
over $A[\tfrac{1}{2}][G]$.

Proof. If $v_{\mathfrak{p}}(\mathfrak{D}) < 0$, then also
$v_{\mathfrak{p}}(j(E)) = 2v_{\mathfrak{p}}(\mathfrak{D}) < 0$, so that $E$ cannot
have potentially good reduction. Hence, when $j(E) \in \mathcal{O}_L$ we will have
$v_{\mathfrak{p}}(G) \geq 0$ for every odd prime
$\mathfrak{p} \in \mathrm{Spec}\,\mathcal{O}_L$. In fact, equation (4) shows that
the norm of $\mathfrak{D}$ is a power of two, and hence
$v_{\mathfrak{p}}(\mathfrak{D}) = 0$. ■

Similar results can be proved for the Legendre models
$Y^2 = X(X - 1)(X - \lambda)$. The advantage of symmetric models is that their
invariants are simpler than the invariants associated to the Legendre models. We
have already seen that $G$ satisfies a triquadratic equation over $A[j]$, while
the equation relating $\lambda$ and $j$ is a general sextic. The discriminant
$\Delta(\lambda)$ of the Legendre model satisfies

$$2^{12}\Delta(\lambda)^3 + (-196608 + 1536j - j^2)\,\Delta(\lambda)^2 + 2^{13}(384 + j)\,\Delta(\lambda) - 2^{24} = 0,$$

which is clearly worse than (4).

We finish this section with a table of symmetric models of the elliptic curves
$E_G$ with complex multiplication by an imaginary quadratic order of class number
1. We also include a number of examples for class number 2 quadratic orders.
All these models have good reduction outside the primes dividing 2 in their
fields of definition.



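h = 1

D | j | $\mathfrak{D}$ | $E_G : Y^2 = f(X)$
-3 | $0$ | $-2^4$ | $X^3 + \sqrt{3}X^2 + X$
-4 | $2^6 3^3$ | $-64$ | $X^3 + X$
-7 | $-3^3 5^3$ | $-1$ | $X^3 + \frac{3\sqrt{7}}{4}X^2 + X$
-8 | $2^6 5^3$ | $2^6$ | $X^3 + 2\sqrt{2}X^2 + X$
-11 | $-2^{15}$ | $\frac{16(4\alpha^2 - 3\alpha - 96)}{3\alpha}$, $\alpha = \sqrt[3]{27 + 21\sqrt{33}}$ | $X^3 + \frac{\sqrt{3}\sqrt{4\alpha^2 + 9\alpha - 96}}{3\sqrt{\alpha}}X^2 + X$
-12 | $2^4 3^3 5^3$ | $-2^8$ | $X^3 + 2\sqrt{-3}X^2 + X$
-16 | $2^3 3^3 11^3$ | $2^9$ | $X^3 + 6X^2 + X$
-19 | $-2^{15} 3^3$ | $-64 + 48\frac{4\alpha^2 + \alpha - 32}{\alpha}$, $\alpha = \sqrt[3]{1 + 3\sqrt{57}}$ | $X^3 + \sqrt{3\frac{4\alpha^2 + \alpha - 32}{\alpha}}X^2 + X$
-27 | $-2^{15} 3 \cdot 5^3$ | $16(-1 - 100\sqrt[3]{2} + 80\sqrt[3]{4})$ | $X^3 + \sqrt{3 - 100\sqrt[3]{2} + 80\sqrt[3]{4}}\,X^2 + X$
-28 | $3^3 5^3 17^3$ | $-2^{12}$ | $X^3 + 6\sqrt{-7}X^2 + X$
-43 | $-2^{18} 3^3 5^3$ | $-64 + 48\frac{40\alpha^2 + \alpha - 3200}{\alpha}$, $\alpha = \sqrt[3]{1 + 63\sqrt{129}}$ | $X^3 + \sqrt{3\frac{40\alpha^2 + \alpha - 3200}{\alpha}}X^2 + X$
-67 | $-2^{15} 3^3 5^3 11^3$ | $-64 + 48\frac{220\alpha^2 + \alpha - 96800}{\alpha}$, $\alpha = \sqrt[3]{1 + 651\sqrt{201}}$ | $X^3 + \sqrt{3\frac{220\alpha^2 + \alpha - 96800}{\alpha}}X^2 + X$
-163 | $-2^{18} 3^3 5^3 23^3 29^3$ | $-64 + 48\frac{26680\alpha^2 + \alpha - 1423644800}{\alpha}$, $\alpha = \sqrt[3]{1 + 557403\sqrt{489}}$ | $X^3 + \sqrt{3\frac{26680\alpha^2 + \alpha - 1423644800}{\alpha}}X^2 + X$

h = 2

D | j | $\mathfrak{D}$ | $E_G : Y^2 = f(X)$
-15 | $-\frac{135}{2}(1415 + 637\sqrt{5})$ | $-\left(\frac{1+\sqrt{5}}{2}\right)^{-8}$ | $X^3 + \sqrt{\frac{3}{32}(27 + 7\sqrt{5})}\,X^2 + X$
-20 | $320(1975 + 884\sqrt{5})$ | $-2^6\left(\frac{1+\sqrt{5}}{2}\right)^6$ | $X^3 + 4\sqrt{-2-\sqrt{5}}\,X^2 + X$
-24 | $1728(1399 + 988\sqrt{2})$ | $2^6(1 + \sqrt{2})^4$ | $X^3 + 2\sqrt{6(3 + 2\sqrt{2})}\,X^2 + X$
-32 | $1000(26125 + 18473\sqrt{2})$ | $2^9(1 + \sqrt{2})^3$ | $X^3 + 2(5 + 4\sqrt{2})X^2 + X$
-36 | $192(399849 + 230888\sqrt{3})$ | $-2^6(2 + \sqrt{3})^4$ | $X^3 + 4\sqrt{-2(12 + 7\sqrt{3})}\,X^2 + X$
-40 | $8640(24635 + 11016\sqrt{5})$ | $2^6\left(\frac{1+\sqrt{5}}{2}\right)^{12}$ | $X^3 + 6\sqrt{2(9 + 4\sqrt{5})}\,X^2 + X$
-48 | $40500(35010 + 20213\sqrt{3})$ | $2^{10}(2 + \sqrt{3})^3$ | $X^3 + 2(15 + 8\sqrt{3})X^2 + X$
-52 | $216000(15965 + 4428\sqrt{13})$ | $-2^6\left(\frac{3+\sqrt{13}}{2}\right)^6$ | $X^3 + 12\sqrt{-(18 + 5\sqrt{13})}\,X^2 + X$
-60 | $\frac{135}{2}(274207975 + 122629507\sqrt{5})$ | $-2^{12}\left(\frac{1+\sqrt{5}}{2}\right)^8$ | $X^3 + 2\sqrt{-3(501 + 224\sqrt{5})}\,X^2 + X$
-64 | $54(761354780 + 538359129\sqrt{2})$ | $2^{10}\sqrt{2}(1 + \sqrt{2})^6$ | $X^3 + 6(11 + 8\sqrt{2})X^2 + X$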


3 Modular Interpretation 

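We now assume that $E_G$ is defined over $\mathbb{C}$ so that it is analytically
isomorphic to a complex torus $\mathbb{C}/\langle 1, \tau \rangle$. To establish
the relation between $\tau$ and $G$ we use the classical Thetanullwerte
(cf. [9]):

$$\theta_2(\tau) = \sum_{n \in \mathbb{Z}} e^{\pi i (n + 1/2)^2 \tau}, \qquad \theta_3(\tau) = \sum_{n \in \mathbb{Z}} e^{\pi i n^2 \tau}, \qquad \theta_4(\tau) = \sum_{n \in \mathbb{Z}} (-1)^n e^{\pi i n^2 \tau}.$$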








J. Guardia, E. Torres, and M. Vela 



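We can substitute

$$j(\tau) = \frac{32\,\bigl(\theta_2(\tau)^8 + \theta_3(\tau)^8 + \theta_4(\tau)^8\bigr)^3}{\theta_2(\tau)^8\,\theta_3(\tau)^8\,\theta_4(\tau)^8}$$

in equation (3) and solve it for $G$. We find six solutions:

$$G_{\pm 2}(\tau) := \pm\frac{\theta_3(\tau)^4 + \theta_4(\tau)^4}{\theta_3(\tau)^2\theta_4(\tau)^2}, \qquad G_{\pm 3}(\tau) := \pm\frac{\sqrt{-1}\,\bigl(\theta_2(\tau)^4 - \theta_4(\tau)^4\bigr)}{\theta_2(\tau)^2\theta_4(\tau)^2}, \qquad G_{\pm 4}(\tau) := \pm\frac{\theta_2(\tau)^4 + \theta_3(\tau)^4}{\theta_2(\tau)^2\theta_3(\tau)^2}. \qquad (5)$$

Concerning the discriminant $\mathfrak{D}$, we find three possible expressions
for it:

$$\mathfrak{D}_2(\tau) = \frac{16\,\theta_2(\tau)^8}{\theta_3(\tau)^4\theta_4(\tau)^4}, \qquad \mathfrak{D}_3(\tau) = -\frac{16\,\theta_3(\tau)^8}{\theta_2(\tau)^4\theta_4(\tau)^4}, \qquad \mathfrak{D}_4(\tau) = \frac{16\,\theta_4(\tau)^8}{\theta_2(\tau)^4\theta_3(\tau)^4}. \qquad (6)$$

It is worth noting that

$$\mathfrak{D}_2(\tau) + \mathfrak{D}_3(\tau) + \mathfrak{D}_4(\tau) = -48, \qquad \mathfrak{D}_2(\tau)\,\mathfrak{D}_3(\tau)\,\mathfrak{D}_4(\tau) = -2^{12},$$

so that the irreducible polynomial of $\mathfrak{D}_r(\tau)$ over
$\mathbb{Q}(j(\tau))$ is

$$\mathfrak{D}_r(\tau)^3 + 48\,\mathfrak{D}_r(\tau)^2 + (768 - j(\tau))\,\mathfrak{D}_r(\tau) + 2^{12} = 0. \qquad (7)$$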
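The relations among the three discriminants are easy to test numerically. The sketch below is an illustration only: the function names and the truncation length are choices made here, and the Thetanullwerte are evaluated from their standard series $\theta_2 = \sum e^{\pi i (n+1/2)^2\tau}$, $\theta_3 = \sum e^{\pi i n^2 \tau}$, $\theta_4 = \sum (-1)^n e^{\pi i n^2 \tau}$, truncated to $|n| \leq 12$, which is ample for $\mathrm{Im}\,\tau$ near 1.

```python
# Illustrative sketch: numerical check of the theta-quotient expressions for the
# three discriminants. Names and truncation length are assumptions of the sketch.
import cmath

def thetas(tau, terms=12):
    """Truncated series for theta_2, theta_3, theta_4 at tau (Im tau > 0)."""
    e = lambda x: cmath.exp(1j * cmath.pi * tau * x)
    ns = range(-terms, terms + 1)
    t2 = sum(e((n + 0.5) ** 2) for n in ns)
    t3 = sum(e(n * n) for n in ns)
    t4 = sum((-1) ** n * e(n * n) for n in ns)
    return t2, t3, t4

def discriminants(tau):
    """Return (D_2, D_3, D_4) as quotients of Thetanullwerte."""
    t2, t3, t4 = thetas(tau)
    d2 = 16 * t2**8 / (t3**4 * t4**4)
    d3 = -16 * t3**8 / (t2**4 * t4**4)
    d4 = 16 * t4**8 / (t2**4 * t3**4)
    return d2, d3, d4
```

At $\tau = i$ the three values are approximately $8$, $-64$ and $8$, and each gives $(\mathfrak{D} + 16)^3/\mathfrak{D} \approx 1728$, in agreement with the row $D = -4$ of the table in Section 2.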



Proposition 2. Let us denote by S = 



the usual gen- 



1 1\ rj. ^ f 01 

0 1/ l^-l 0^ 

erators of the modular group F = SL 2 { 1 d), and consider the groups Fq( 2) = 
{S,TS^T), Fg = S~^F^{2)S = (S^,T), r°(2) = TFq{ 2)T~^ = {S^,FST) and 
the associated modular curves Xq( 2), Xg, X^{2). 

a) ® 2 (t) is a Hauptmodul for the modular curve Xo(2). 

b) ® 3 (t) is a Hauptmodul for the modular curve Xg. 

c) ® 4 (t) is a Hauptmodul for the modular curve X^(2). 

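Proof. The action of $S$, $T$ on the set of Thetanullwerte is well-known:

\[ \begin{aligned} \theta_2(S\tau)&=\zeta_8\,\theta_2(\tau), & \theta_2(T\tau)&=\zeta_8^{-1}\sqrt{\tau}\,\theta_4(\tau),\\ \theta_3(S\tau)&=\theta_4(\tau), & \theta_3(T\tau)&=\zeta_8^{-1}\sqrt{\tau}\,\theta_3(\tau),\\ \theta_4(S\tau)&=\theta_3(\tau), & \theta_4(T\tau)&=\zeta_8^{-1}\sqrt{\tau}\,\theta_2(\tau), \end{aligned} \]

where $\zeta_8=\exp(2\pi i/8)$. From these relations we derive the equalities

\[ \begin{aligned} \mathfrak{D}_2(S\tau)&=\mathfrak{D}_2(\tau), & \mathfrak{D}_2(T\tau)&=\mathfrak{D}_4(\tau),\\ \mathfrak{D}_3(S\tau)&=\mathfrak{D}_4(\tau), & \mathfrak{D}_3(T\tau)&=\mathfrak{D}_3(\tau),\\ \mathfrak{D}_4(S\tau)&=\mathfrak{D}_3(\tau), & \mathfrak{D}_4(T\tau)&=\mathfrak{D}_2(\tau), \end{aligned} \]

which imply that the $\mathfrak{D}_r(\tau)$ are modular functions for the given groups. The index of these groups in $\Gamma$ is 3, and this is precisely the degree of the irreducible equation of $\mathfrak{D}_r(\tau)$ over $\mathbb{Q}(j(\tau))$.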

Consequently, equation (7) is a rational model for the modular curves $X_0(2)$, $X_\theta$, $X^0(2)$.

In a similar way we can prove:

Proposition 3. The symmetric invariants $G_{\pm2}(\tau)$, $G_{\pm3}(\tau)$ and $G_{\pm4}(\tau)$ are Hauptmoduln for the modular curves associated to the congruence groups $\Gamma_0(4)$, $S^{-1}\Gamma^0(4)S$ and $\Gamma^0(4)$ respectively.



3.1 Fourier Expansions 

The Fourier expansions of the modular functions begin with:

\[ \begin{aligned} \mathfrak{D}_2(\tau)&=\sum_{n>0} d_n q^n = 2^{12}\,(q+24q^2+300q^3+2624q^4+18126q^5+\dots),\\ \mathfrak{D}_3(\tau)&=-\frac{1}{\sqrt q}-24-276\sqrt q-2048\,q-11202\,q^{3/2}-49152\,q^2-184024\,q^{5/2}+O(q^3),\\ \mathfrak{D}_4(\tau)&=\frac{1}{\sqrt q}-24+276\sqrt q-2048\,q+11202\,q^{3/2}-49152\,q^2+184024\,q^{5/2}+O(q^3). \end{aligned} \tag{8} \]

In figure 1 we have plotted the logarithmic height of the coefficients $d_n$ and $c_n$ of the Fourier series of $\mathfrak{D}_2(\tau)$ and $j(\tau)$ respectively. It is apparent that the $d_n$ are considerably smaller than the $c_n$.
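The first coefficients of $\mathfrak{D}_2(\tau)$ can be reproduced with exact integer arithmetic. The sketch below assumes the product form $\mathfrak{D}_2(\tau)=2^{12}\,q\prod_{n\ge1}(1+q^n)^{24}$, which follows from the eta quotient $2^{12}\Delta(2\tau)/\Delta(\tau)$; the function name is illustrative.

```python
def d2_coefficients(num_terms):
    """Coefficients of prod_{n>=1} (1 + q^n)^24, truncated after q^(num_terms - 1)."""
    series = [0] * num_terms
    series[0] = 1
    for n in range(1, num_terms):
        for _ in range(24):                       # multiply by (1 + q^n) twenty-four times
            for k in range(num_terms - 1, n - 1, -1):
                series[k] += series[k - n]        # in-place update, highest degree first
    return series

coeffs = d2_coefficients(5)
# D_2 = 2^12 * q * (coeffs[0] + coeffs[1] q + ...), hence d_n = 2^12 * coeffs[n - 1].
d = [4096 * c for c in coeffs]
```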
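The series expansions of the $\mathfrak{D}_r(\tau)$ show that these invariants will be smaller in absolute value than the $j$-invariant, but in order to compute the values of these functions one should use their expressions (6) in terms of the Thetanullwerte, or use the following relations

\[ \mathfrak{D}_2(\tau)=f_2(\tau)^{24}=2^{12}\,\frac{\Delta(2\tau)}{\Delta(\tau)},\qquad \mathfrak{D}_3(\tau)=-f(\tau)^{24}=\frac{\Delta\bigl(\frac{\tau+1}{2}\bigr)}{\Delta(\tau)},\qquad \mathfrak{D}_4(\tau)=f_1(\tau)^{24}=\frac{\Delta\bigl(\frac{\tau}{2}\bigr)}{\Delta(\tau)}, \]

where

\[ f(\tau)=e^{-\pi i/24}\,\frac{\eta\bigl(\frac{\tau+1}{2}\bigr)}{\eta(\tau)},\qquad f_1(\tau)=\frac{\eta\bigl(\frac{\tau}{2}\bigr)}{\eta(\tau)},\qquad f_2(\tau)=\sqrt2\,\frac{\eta(2\tau)}{\eta(\tau)}, \]

are the Schläfli functions, $\eta(\tau)=q^{1/24}\bigl(1+\sum_{n\ge1}(-1)^n\bigl(q^{n(3n-1)/2}+q^{n(3n+1)/2}\bigr)\bigr)$ is Dedekind's eta function, and $\Delta(\tau)=\eta(\tau)^{24}$ is the classical discriminant modular form.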



4 Ring Class Fields 

We recall the construction of ray class fields of quadratic imaginary extensions of $\mathbb{Q}$. For the remainder of the paper, $K=\mathbb{Q}(\sqrt{D})$ will be a quadratic imaginary number field of discriminant $D<0$, with ring of integers $\mathcal{O}_K=\mathbb{Z}[\tau]$ and class number $h=h(K)$. We take an elliptic curve

\[ E : Y^2 = X^3 + AX + B \]

in Weierstrass form with complex multiplication by $\mathcal{O}_K$. We assume that $A, B\in K(j)$, where $j$ is the $j$-invariant of $E$, which generates the Hilbert class field of $K$. The Weber function on $E$ is defined as:

\[ h((x,y))=\begin{cases} x & \text{if } j(\tau)\neq 0, 1728,\\ x^2 & \text{if } j(\tau)=1728,\\ x^3 & \text{if } j(\tau)=0. \end{cases} \]

Theorem 2. ([13], [12]) Given an integral ideal $\mathfrak{a}$ of $\mathcal{O}_K$, the ray class field modulo $\mathfrak{a}$ is $K(j(E),h(E[\mathfrak{a}]))$.

We shall use this result to express the ring class field $K_{\mathcal{O}_2}$ of the order $\mathcal{O}_2=\mathbb{Z}[2\tau]$ of conductor 2 in $K$ in terms of our invariants $\mathfrak{D}_r(\tau)$.

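First of all, we note that

\[ Y^2 = X^3-\frac{(\mathfrak{D}_r(\tau)+16)^3}{48\,(\mathfrak{D}_r(\tau)-8)^2(\mathfrak{D}_r(\tau)+64)}\,X+\frac{(\mathfrak{D}_r(\tau)+16)^3}{864\,(\mathfrak{D}_r(\tau)-8)^2(\mathfrak{D}_r(\tau)+64)} \]

is an elliptic curve with $j$-invariant $j(\tau)=\dfrac{(\mathfrak{D}_r(\tau)+16)^3}{\mathfrak{D}_r(\tau)}$, defined over $\mathbb{Q}(j(\tau))$. The $X$-coordinates of the non-trivial 2-torsion points of this curve are

\[ \frac{\mathfrak{D}_r(\tau)+16}{12\,(\mathfrak{D}_r(\tau)-8)},\qquad -\frac{(\mathfrak{D}_r(\tau)+16)\bigl(\mathfrak{D}_r(\tau)+64\pm3\sqrt{\mathfrak{D}_r(\tau)(\mathfrak{D}_r(\tau)+64)}\bigr)}{24\,(\mathfrak{D}_r(\tau)-8)(\mathfrak{D}_r(\tau)+64)}. \]

For $D=-3$, $j=0$ the three invariants $\mathfrak{D}_r(\tau)$ take the value $-16$, and we get $K_{\mathcal{O}_2}=K$. For $D=-4$, $j=1728$ the values of the $\mathfrak{D}_r(\tau)$ are $8$ and $-64$, and again $K_{\mathcal{O}_2}=K=K(\mathfrak{D}_r(\tau))$. In the general case, we have: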



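\[ \begin{aligned} K_{\mathcal{O}_2} &= K\left(j(\tau),\ \frac{\mathfrak{D}_r(\tau)+16}{\mathfrak{D}_r(\tau)-8},\ \sqrt{\frac{\mathfrak{D}_r(\tau)+64}{\mathfrak{D}_r(\tau)}}\right) = K\left(j(\tau)-1728,\ \frac{\mathfrak{D}_r(\tau)+16}{\mathfrak{D}_r(\tau)-8},\ \sqrt{\frac{\mathfrak{D}_r(\tau)+64}{\mathfrak{D}_r(\tau)}}\right)\\ &= K\left(\frac{(\mathfrak{D}_r(\tau)-8)^2(\mathfrak{D}_r(\tau)+64)}{\mathfrak{D}_r(\tau)},\ \frac{\mathfrak{D}_r(\tau)+16}{\mathfrak{D}_r(\tau)-8},\ \sqrt{\frac{\mathfrak{D}_r(\tau)+64}{\mathfrak{D}_r(\tau)}}\right)\\ &= K\left((\mathfrak{D}_r(\tau)-8)^2,\ (\mathfrak{D}_r(\tau)+16)(\mathfrak{D}_r(\tau)-8),\ \sqrt{\frac{\mathfrak{D}_r(\tau)+64}{\mathfrak{D}_r(\tau)}}\right). \end{aligned} \]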

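Since the discriminant of the polynomial $F_{\mathfrak{D},j}(X)=X^3+48X^2+(768-j(\tau))X+2^{12}$ is precisely

\[ 4\,j(\tau)^2\,(j(\tau)-1728)=\frac{4\,(\mathfrak{D}_r(\tau)-8)^2(\mathfrak{D}_r(\tau)+16)^6(\mathfrak{D}_r(\tau)+64)}{\mathfrak{D}_r(\tau)^3}, \]

we see that $K_{\mathcal{O}_2}$ is the splitting field of $F_{\mathfrak{D},j}(X)$ and we have proved:

Theorem 3. For any fundamental discriminant $D<0$, $K_{\mathcal{O}_2}=K(\mathfrak{D}_2(\tau),\mathfrak{D}_3(\tau))$.

Corollary 1. $K(\mathfrak{D}_r(\tau))$ is an abelian extension of $K(j(\tau))$.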



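Lemma 1. Let $\mathcal{O}_D$ and $\mathcal{O}_{4D}$ be the imaginary quadratic orders of discriminant $D$ and $4D$ respectively, with $D<-4$. Then:

\[ h(\mathcal{O}_{4D})=\begin{cases} h(\mathcal{O}_D) & \text{if } D\equiv 1 \pmod 8,\\ 2\,h(\mathcal{O}_D) & \text{if } D\equiv 0 \pmod 4,\\ 3\,h(\mathcal{O}_D) & \text{if } D\equiv 5 \pmod 8. \end{cases} \]

Proof. The formula follows from the comparison of both class numbers with the class number of the maximal order $\mathcal{O}_K$ of the common field of fractions $K$ of $\mathcal{O}_D$ and $\mathcal{O}_{4D}$, by means of the well-known formula relating the class number of the order $\mathcal{O}$ of conductor $f$ with the class number of $\mathcal{O}_K$ (cf. [4, p. 146]), where $d_K$ denotes the discriminant of $K$:

\[ h(\mathcal{O})=\frac{h(\mathcal{O}_K)\,f}{[\mathcal{O}_K^*:\mathcal{O}^*]}\prod_{p\mid f}\left(1-\left(\frac{d_K}{p}\right)\frac1p\right). \]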
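Lemma 1 can be checked experimentally for small discriminants. The following sketch counts reduced primitive binary quadratic forms, which is one standard way to obtain the class number of an imaginary quadratic order; the function names are illustrative and not taken from the text.

```python
from math import gcd

def class_number(disc):
    """Number of reduced primitive forms (a, b, c) with b^2 - 4ac = disc < 0."""
    assert disc < 0 and disc % 4 in (0, 1)
    count = 0
    a = 1
    while 3 * a * a <= -disc:                 # reduced forms satisfy a <= sqrt(|disc| / 3)
        for b in range(-a + 1, a + 1):        # -a < b <= a
            if (b * b - disc) % (4 * a):
                continue
            c = (b * b - disc) // (4 * a)
            if c < a or (c == a and b < 0):   # reduction conditions
                continue
            if gcd(gcd(a, b), c) == 1:        # keep primitive forms only
                count += 1
        a += 1
    return count

def predicted_ratio(disc):
    """Ratio h(O_4D) / h(O_D) as stated in Lemma 1 (requires D < -4)."""
    if disc % 8 == 1:
        return 1
    if disc % 4 == 0:
        return 2
    return 3                                  # remaining case: disc = 5 mod 8

checked = {D: (class_number(D), class_number(4 * D)) for D in (-7, -8, -11, -15, -19, -20, -23)}
```

For instance, `checked[-15]` pairs the class numbers of the orders of discriminant $-15$ and $-60$, the case treated in Section 6.1.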
Combining this lemma with theorem 3 and the decomposition of the discriminant of $F_{\mathfrak{D},j}(X)$ we have a very simple proof of the following well-known result ([6]):

Corollary 2. Let $\tau$ be a quadratic number of discriminant $D<0$ with $D\equiv 1 \pmod 4$. The singular value $j(\tau)-1728$ is a square in $K(j(\tau))$.

5 The Polynomial $H_D[\mathfrak{D}](X)$

We fix a system of representatives of the classes of quadratic forms of discriminant $D$, and compute the corresponding quadratic numbers $\tau_1,\dots,\tau_h\in K$, so that the Hilbert class polynomial is

\[ H_D(X)=(X-j(\tau_1))\cdots(X-j(\tau_h))=X^h+c_{h-1}X^{h-1}+\dots+c_0\in\mathbb{Z}[X]. \]

We now want to study the polynomial:

\[ H_D[\mathfrak{D}](X)=\prod_{r,k}\bigl(X-\mathfrak{D}_r(\tau_k)\bigr)=\prod_{k=1}^{h}\bigl(X^3+48X^2+(768-j(\tau_k))X+2^{12}\bigr)=\sum_{k=0}^{3h}d_kX^k. \]

The second expression shows that $H_D[\mathfrak{D}](X)\in\mathbb{Z}[X]$, since its coefficients are symmetric functions of $j(\tau_1),\dots,j(\tau_h)$. It is not a class polynomial in a strict sense, since it is not irreducible in many cases. But it has a number of properties which make it interesting. For instance, some of its coefficients are independent of $D$ and $j(\tau_k)$:

\[ H_D[\mathfrak{D}](X)=X^{3h}+48h\,X^{3h-1}+\dots+2^{12h}. \]

The a priori knowledge of these coefficients of $H_D[\mathfrak{D}](X)$ is very useful for the numerical computation of the remaining coefficients: one can use them either to shorten computations or to check the validity of the results.

The remaining coefficients of $H_D[\mathfrak{D}](X)$ can be related to the coefficients of the Hilbert class polynomial. For instance:

\[ \begin{aligned} d_1&=2^{12h-4}\,3h+2^{12(h-1)}c_{h-1},\\ d_2&=2^{12(h-2)}\left(2^{16}3^2\binom{h}{2}+2^{16}\,3h+2^{8}\,3(h-1)\,c_{h-1}+c_{h-2}\right),\\ d_{3h-2}&=2^{8}3^2\binom{h}{2}+2^{8}\,3h+c_{h-1}. \end{aligned} \tag{9} \]

All these relations have integral coefficients, an important fact since it implies that they remain valid over any finite field of odd characteristic.
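These relations can be tested with exact integer arithmetic. The sketch below relies on the identity $X^3+48X^2+768X+2^{12}=(X+16)^3$, which gives $H_D[\mathfrak{D}](X)=\sum_{i=0}^{h}c_i\,(X+16)^{3i}X^{h-i}$ with $c_h=1$; the helper names and the sample coefficient list are illustrative assumptions, and the sample list is not a real class polynomial.

```python
from math import comb

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def h_d_from_hilbert(c):
    """Coefficients d_0..d_3h of sum_i c_i (X+16)^(3i) X^(h-i), where c = [c_0, ..., c_{h-1}, 1]."""
    h = len(c) - 1
    cube = [4096, 768, 48, 1]                       # (X + 16)^3
    total = [0] * (3 * h + 1)
    power = [1]                                     # (X + 16)^(3i)
    for i, ci in enumerate(c):
        term = [0] * (h - i) + [ci * a for a in power]
        for k, a in enumerate(term):
            total[k] += a
        power = poly_mul(power, cube)
    return total

c = [5, -7, 11, 1]                                  # arbitrary monic cubic, used only as a test input
h = len(c) - 1
d = h_d_from_hilbert(c)
expected_d1 = 2 ** (12 * h - 4) * 3 * h + 2 ** (12 * (h - 1)) * c[h - 1]
expected_d2 = 2 ** (12 * (h - 2)) * (2 ** 16 * 9 * comb(h, 2) + 2 ** 16 * 3 * h
                                     + 2 ** 8 * 3 * (h - 1) * c[h - 1] + c[h - 2])
expected_top = 2 ** 8 * 9 * comb(h, 2) + 2 ** 8 * 3 * h + c[h - 1]
```

Because the relations are polynomial identities in the $c_i$, any monic integer polynomial can serve as test input.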

We now proceed to study the factorization of $H_D[\mathfrak{D}](X)$ over $\mathbb{Q}$.

Lemma 2. Let $\tau_0=[\ldots]$ and $j_0=j(\tau_0)$. If $D\equiv 0 \pmod 4$ then $\mathfrak{D}_2(\tau_0),\mathfrak{D}_3(\tau_0),\mathfrak{D}_4(\tau_0)\in\mathbb{R}$. If $D\equiv 1 \pmod 4$ only one of the three $\mathfrak{D}_r(\tau_0)$ is real.

Proof. The polynomial $F_{\mathfrak{D},j}(X)$ with $j\in\mathbb{R}$ has three distinct real roots if and only if its discriminant $4j^2(j-1728)$ is positive, that is, if and only if $j>1728$. It is well known that $j(-1/2+it)<0$ for $t>\sqrt3/2$, $j(-1/2+i\sqrt3/2)=0$ and $j(it)>1728$ for $t>1$. ■



Lemma 3. If $\mathfrak{D}_r(\tau_0)\in K(j_0)\cap\mathbb{R}$ then the irreducible polynomial of $\mathfrak{D}_r(\tau_0)$ over $\mathbb{Q}$ has degree $h$.

Corollary 3.

a) For $D\equiv 1 \pmod 8$, $H_D[\mathfrak{D}](X)$ factors over $\mathbb{Q}[X]$ as a product $f_h(X)f_{2h}(X)$, with $f_k(X)\in\mathbb{Z}[X]$ an irreducible polynomial of degree $k$. Both $f_h(X)$ and $f_{2h}(X)$ split linearly in $K(j_0)$.

b) For $D\equiv 0 \pmod 4$, $H_D[\mathfrak{D}](X)=f_h(X)f_{2h}(X)$, with $f_k(X)\in\mathbb{Z}[X]$ an irreducible polynomial of degree $k$. The polynomial $f_h(X)$ splits linearly in $K(j_0)$ and $f_{2h}(X)$ splits as a product of $h$ irreducible polynomials of degree 2.

c) For $D\equiv 5 \pmod 8$, $H_D[\mathfrak{D}](X)$ is irreducible in $\mathbb{Z}[X]$. It splits as a product of $h$ irreducible polynomials of degree 3 over $K(j_0)$.

6 Computation of Class Polynomials 

The Hilbert class polynomial $H_D(X)$ has been widely studied both from the theoretical and the numerical viewpoint ([2], [6], [8], [14], [7]). The construction of class polynomials with smaller coefficients for defining Hilbert class fields has also received considerable attention ([3], [5], [10]). However, for practical applications (cryptography, primality proving) it is necessary to find the roots modulo $p$ of the Hilbert class polynomial in order to build elliptic curves with complex multiplication. The available methods ([5]) derive these roots from the reduction modulo $p$ of a simpler class polynomial. We do not claim to improve these algorithms, but rather to introduce the use of the invariants $\mathfrak{D}_r(\tau)$ in this problem. While the corresponding class polynomials are not the smallest possible, their properties make them quite manageable.

Expressions (9) show that one can easily derive the Hilbert class polynomial $H_D(X)$ from the polynomial $H_D[\mathfrak{D}](X)$. Unfortunately, these expressions also suggest that we can expect the coefficients $d_k$ to be of the same order of magnitude as the $c_k$, so that it is not a good strategy to use $H_D[\mathfrak{D}](X)$ directly to compute the Hilbert class polynomials. On the other hand, the series expansions (8) provide very good bounds on the size of the roots $\mathfrak{D}_r(\tau_k)$ of $H_D[\mathfrak{D}](X)$. For instance, the highest absolute value of the roots of $H_D[\mathfrak{D}](X)$ is approximately $\exp(\pi\sqrt{|D|}/2)$, which is more or less the square root of the largest root of $H_D(X)$. Whenever $D\equiv 1 \pmod 8$ or $D\equiv 0 \pmod 4$ the polynomial $H_D[\mathfrak{D}](X)$ has an irreducible factor $f_h(X)\in\mathbb{Q}[X]$ of degree $h$. The size of the coefficients of these factors is at most the square root of the coefficients of the Hilbert class polynomial (in general they are considerably smaller), so that they provide nice class polynomials. We will show with a couple of examples that these factors allow an efficient computation of Hilbert class polynomials. Moreover, the use of the $\mathfrak{D}_r(\tau)$ for the computation of the Hilbert class polynomial simultaneously produces a polynomial defining the ring class field $K_{\mathcal{O}_2}$.



6.1 Computation of the Hilbert Class Polynomial for the
Discriminant D = −15

We shall find the irreducible factors of the polynomial $H_{-15}[\mathfrak{D}](X)$ and derive from them the Hilbert class polynomial $H_{-15}(X)$. By corollary 3, the decomposition of $H_{-15}[\mathfrak{D}](X)$ in this case must be $H_{-15}[\mathfrak{D}](X)=f_2(X)f_4(X)$ with $f_k(X)\in\mathbb{Z}[X]$ a monic polynomial of degree $k$. A system of representatives of quadratic forms of discriminant $-15$ is $g_1=X^2+XY+4Y^2$, $g_2=2X^2+XY+2Y^2$, with roots $\tau_1=\frac{-1+\sqrt{-15}}{2}$ and $\tau_2=\frac{-1+\sqrt{-15}}{4}$. We know that $j(\tau_1)<0$ and $0<j(\tau_2)<1728$, so that only two of the $\mathfrak{D}_r(\tau_k)$ are real. Moreover, it can be seen that $\mathfrak{D}_2(\tau)$ takes real values on the half line $\tau=-\frac12+it$, so that we can assume that $\mathfrak{D}_2(\tau_1)=-0.021286\ldots$ is a root of $f_2(X)$. We write

\[ f_2(X)=\bigl(X-\mathfrak{D}_2(\tau_1)\bigr)\bigl(X-2^r/\mathfrak{D}_2(\tau_1)\bigr) \]

and look for the value of $r$ which produces integral coefficients for this polynomial. We find $r=0$ and

\[ f_2(X)=X^2+47X+1. \]

From the equality

\[ f_2(X)f_4(X)=H_{-15}[\mathfrak{D}](X) \]

we obtain a linear system in the coefficients of $f_4(X)$ and $H_{-15}[\mathfrak{D}](X)$, whose solution gives:

\[ \begin{aligned} f_4(X)&=X^4+49X^3+192561X^2+200704X+2^{24},\\ H_{-15}(X)&=X^2+191025X-121287375. \end{aligned} \]

The ring class field of the quadratic order of discriminant $-60$ is generated over $\mathbb{Q}(\sqrt{D})$ by any root of the polynomial $f_4(X)$. Note that we have needed only 6 significant figures in $\mathfrak{D}_2(\tau_1)$ for the whole computation, while 17 figures of $j(\tau_1)$ and $j(\tau_2)$ are necessary to compute the Hilbert class polynomial directly.

Similar tricks can be applied to compute the irreducible factors of the polynomial $H_D[\mathfrak{D}](X)$ for any discriminant $D\equiv 1 \pmod 8$ or $D\equiv 0 \pmod 4$ with class number 2.
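The numbers of this example can be cross-checked in a few lines. The sketch below assumes the product expansion $\mathfrak{D}_2(\tau)=2^{12}q\prod_{n\ge1}(1+q^n)^{24}$ with $q=e^{2\pi i\tau}$ and the identity $H_D[\mathfrak{D}](X)=\sum_i c_i(X+16)^{3i}X^{h-i}$; variable names are illustrative.

```python
import math

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

f2 = [1, 47, 1]
f4 = [2 ** 24, 200704, 192561, 49, 1]
cube = [4096, 768, 48, 1]                                   # (X + 16)^3

# H_{-15}[D](X) = (X+16)^6 + 191025 X (X+16)^3 - 121287375 X^2
rhs = poly_mul(cube, cube)
for k, a in enumerate(cube):
    rhs[k + 1] += 191025 * a
rhs[2] += -121287375
lhs = poly_mul(f2, f4)

# Numerical value of D_2(tau_1) for tau_1 = (-1 + sqrt(-15)) / 2, where q is real and negative.
q = -math.exp(-math.pi * math.sqrt(15))
d2_tau1 = 4096 * q
for n in range(1, 20):
    d2_tau1 *= (1 + q ** n) ** 24
residual = d2_tau1 ** 2 + 47 * d2_tau1 + 1                  # f_2 evaluated at D_2(tau_1)
```

Here `lhs == rhs` confirms the factorization, and `residual` being numerically zero confirms that $\mathfrak{D}_2(\tau_1)$ is a root of $f_2$.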




Stable Models of Elliptic Curves 261 



6.2 Computation of the Hilbert Class Polynomial for the
Discriminant D = −1588

We know in this case that $H_{-1588}[\mathfrak{D}](X)$ splits as the product of two irreducible factors $f_6(X), f_{12}(X)\in\mathbb{Z}[X]$. We could study the action of the Galois group of the extension $K_{\mathcal{O}_2}/K$ on the $\mathfrak{D}_r(\tau)$, and then determine the roots of the polynomials $f_6(X)$, $f_{12}(X)$. However, this is not strictly necessary, at least for low class numbers.

We begin by finding the six reduced forms of discriminant $-1588$ and their corresponding roots $\tau_k$:

\[ \tau_1=\sqrt{-397},\quad \tau_2=\frac{-1+\sqrt{-397}}{2},\quad \tau_3=\frac{-3+\sqrt{-397}}{7},\quad \tau_4=\frac{3+\sqrt{-397}}{7},\quad \tau_5=\frac{-3+\sqrt{-397}}{14},\quad \tau_6=\frac{3+\sqrt{-397}}{14}. \]

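We now compute the six values $\mathfrak{D}_2(\tau_k)$ with $\log_{10}\exp(\pi\sqrt{397})+10\approx 40$ significant digits. (We may take into account that $\mathfrak{D}_2\bigl(\frac{-b+\sqrt{D}}{2a}\bigr)=\overline{\mathfrak{D}_2\bigl(\frac{b+\sqrt{D}}{2a}\bigr)}$ to save some time.) The remaining values $\mathfrak{D}_3(\tau_k)$ and $\mathfrak{D}_4(\tau_k)$ are computed solving the system

\[ \begin{aligned} \mathfrak{D}_3(\tau_k)+\mathfrak{D}_4(\tau_k)&=-48-\mathfrak{D}_2(\tau_k),\\ \mathfrak{D}_3(\tau_k)\,\mathfrak{D}_4(\tau_k)&=-2^{12}/\mathfrak{D}_2(\tau_k). \end{aligned} \]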

We knew a priori that only four of the $\mathfrak{D}_r(\tau_k)$ should be real: $\mathfrak{D}_2(\tau_1)$, $\mathfrak{D}_3(\tau_1)$, $\mathfrak{D}_4(\tau_1)$ and $\mathfrak{D}_2(\tau_2)$. The last one and one of the $\mathfrak{D}_r(\tau_1)$ must be roots of the polynomial $f_6(X)$. We form the products $T_r=\mathfrak{D}_r(\tau_1)\mathfrak{D}_2(\tau_2)$. We now look for the product $\mathfrak{D}_{r}(\tau_3)\mathfrak{D}_{r}(\tau_4)\mathfrak{D}_{r'}(\tau_5)\mathfrak{D}_{r'}(\tau_6)$ of pairs of complex conjugates which, multiplied by one of the $T_r$, gives a power of 2. We find that

\[ \mathfrak{D}_3(\tau_1)\,\mathfrak{D}_2(\tau_2)\,\mathfrak{D}_2(\tau_3)\,\mathfrak{D}_2(\tau_4)\,\mathfrak{D}_4(\tau_5)\,\mathfrak{D}_4(\tau_6)=2^{36}, \]

and we deduce that these are the roots of $f_6(X)$. Hence:

\[ \begin{aligned} f_6(X)={}&X^6+1531109538224690703626898816\,X^5-5138592259654383232415544053760\,X^4\\ &+89443843056755983217687674691256320\,X^3-21047673895544353719974068444200960\,X^2\\ &+25687755442455892467940464846176256\,X+2^{36}. \end{aligned} \]

As in the previous example, we could now find the polynomials $f_{12}(X)$ and $H_{-1588}(X)$ by solving a linear system. The direct computation of $H_{-1588}(X)$ from its roots $j(\tau_k)$ requires 90 significant digits for every root. The largest coefficient of $H_{-1588}(X)$ has 102 decimal figures, while the largest coefficient of $f_6$ has 35 digits. It is worth noting that if we wanted to construct an elliptic curve over $\mathbb{F}_p$ ($p>2$) with complex multiplication by the quadratic order of discriminant $-1588$, it would be unnecessary to compute the Hilbert polynomial, since the roots of the polynomial $f_6(X)$ modulo $p$ directly provide equations for such curves.
Abstract. We develop an algorithm for computing isomorphisms and
automorphisms of algebraic function fields of transcendence degree one
in characteristic zero and positive characteristic.



1 Introduction 

Let $F_1/k$ and $F_2/k$ denote algebraic function fields of transcendence degree one. An isomorphism $\phi$ of $F_1/k$ and $F_2/k$ is an isomorphism of fields $\phi : F_1\to F_2$ whose restriction to $k$ is the identity map. The main objective of this paper is to develop an algorithm to compute one or all isomorphisms of $F_1/k$ and $F_2/k$ if these algebraic function fields are isomorphic and have genus greater than one. We think of $F_1/k$ and $F_2/k$ as the function fields of some suitable, explicitly given and not necessarily non-singular irreducible algebraic curves, such as plane curves defined over $k$, and describe $\phi$ by its action on the corresponding coordinates or field generators. For the special case $F=F_1=F_2$ the algorithm computes the elements of the automorphism group $\mathrm{Aut}_k(F)$ of $F/k$. We restrict to genus greater than one since otherwise the number of isomorphisms or the automorphism groups may be infinite, and the task is related to finding unique models by computing rational points on curves over $k$, a hard problem that is quite different from the techniques of this paper.

Assume that $F_1$ and $F_2$ are given as finite and separable extensions of an algebraic function field $K/k$ of transcendence degree one. Since $F_1$ and $F_2$ can be obtained by adjoining roots of monic, irreducible and separable polynomials $f_1, f_2\in K[t]$ to $K$ respectively, the computation of isomorphisms $\phi$ of $F_1/k$ and $F_2/k$ which restrict to the identity on $K$ can essentially be reduced to the computation of the roots of $f_1$ in $F_2$ or of $f_2$ in $F_1$. This latter task can in turn be solved by special Hensel lifting and reconstruction techniques, or by general polynomial factorization algorithms. A concrete algorithm for computing such isomorphisms of $F_1$ and $F_2$ and for computing the elements of $\mathrm{Aut}_K(F)$ for finite extensions $F_1/K$, $F_2/K$, $F/K$ and $K=\mathbb{Q}(x)$ is given in [6]. The computation of $\mathrm{Aut}_k(F)$, for example, cannot be done this way since the fixed field of $\mathrm{Aut}_k(F)$, which would be a candidate for $K$, is a priori not known.

In the following we need to define and compare various entities for the function fields $F_1$ and $F_2$. To facilitate this we use the subscripts $(1)$, $(2)$ for either of the two cases and $(\alpha)$ for both (hence $\alpha\in\{1,2\}$). Entities with no subscript are related to both or completely independent of the two fields. We are thus considering the function fields $F_{(1)}$, $F_{(2)}$ and the isomorphisms $\phi$.

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 263-271, 2004.

© Springer-Verlag Berlin Heidelberg 2004

Throughout the paper $F_{(\alpha)}/k$ denotes an algebraic function field of transcendence degree one and genus greater than one. Unless otherwise stated, $k$ is assumed to be the exact constant field of $F_{(\alpha)}/k$ (i.e. algebraically closed in $F_{(\alpha)}$) and perfect. Our algorithms rely on the ability to compute in $F_{(\alpha)}/k$ as a field and $k(x_{(\alpha)})$-vector space for $x_{(\alpha)}$ a separating element of $F_{(\alpha)}/k$, and to compute with places, divisors and Riemann-Roch spaces $L(D_{(\alpha)})=\{a\in F_{(\alpha)} \mid (a)+D_{(\alpha)}\ge 0\}\cup\{0\}$ for divisors $D_{(\alpha)}$. In particular an algorithm for computing Weierstrass places is required. We refer to [3,4] for algorithmic aspects and to [8] for the theoretical background. Implementations of these algorithms are available in Kash [5] and Magma [1,2].

2 The Basic Idea 

Assume that $\phi$ is an isomorphism of $F_{(1)}/k$ and $F_{(2)}/k$. We derive a number of necessary conditions for the existence of $\phi$, which altogether will also be sufficient. This also leads to a test for $F_{(1)}/k$ and $F_{(2)}/k$ being not isomorphic.

First, the genus of $F_{(1)}/k$ and $F_{(2)}/k$ is equal and $\phi$ maps places $P_{(1)}$ of $F_{(1)}/k$ to places $P_{(2)}$ of $F_{(2)}/k$, preserving the degree and gap numbers and inducing $k$-linear isomorphisms of the vector spaces $L(nP_{(1)})$ and $L(nP_{(2)})$. In particular, places of $F_{(1)}/k$ of smallest degree are mapped to places of $F_{(2)}/k$ of smallest degree and Weierstrass places of $F_{(1)}/k$ are mapped to Weierstrass places of $F_{(2)}/k$. For global function fields (that is, $k$ a finite field) there are only finitely many places of any fixed degree, and in general there are only finitely many Weierstrass places. This restricts the possibilities $P_{(2)}=\phi(P_{(1)})$ to a finite number. We remark that both sets of places (smallest degree, Weierstrass) can be computed efficiently.

Let $P_{(1)}$ and $P_{(2)}$ be places of degree one with $P_{(2)}=\phi(P_{(1)})$. Let $g$ be the genus of $F_{(1)}$ and $F_{(2)}$ and let $n_i\in\mathbb{Z}^{\ge1}$ for $1\le i\le r\le g+1$ be the first successive pole numbers greater than zero at $P_{(1)}$, equal to those at $P_{(2)}$, such that $\gcd\{n_i \mid 1\le i\le r\}=1$. Let $x_{(\alpha),i}\in L(n_iP_{(\alpha)})$ be elements with $v_{P_{(\alpha)}}(x_{(\alpha),i})=-n_i$. We define $x_{(\alpha),0}=1$ and $n_0=0$. Obviously, $x_{(\alpha),i}$ is uniquely defined up to multiplication by elements from $k^\times$ and addition by $k$-linear combinations of the $x_{(\alpha),j}$ for $j<i$. As a result, $\phi(x_{(1),i})=\sum_{j=0}^{i}\mu_{i,j}\,x_{(2),j}$ for suitable $\mu_{i,j}\in k$ and $\mu_{i,i}\neq 0$.

These observations are interesting because $F_{(\alpha)}=k(x_{(\alpha),1},\dots,x_{(\alpha),r})$, so $\phi$ is completely defined by the $\mu_{i,j}$. This equality holds because $P_{(\alpha)}$ is both fully ramified and unramified over $k(x_{(\alpha),1},\dots,x_{(\alpha),r})$ due to the gcd condition. We let $I_{(\alpha)}$ be the kernel of the substitution homomorphism $k[t_1,\dots,t_r]\to F_{(\alpha)}$, $t_i\mapsto x_{(\alpha),i}$. Thus generators of $I_{(1)}$ and $I_{(2)}$ define irreducible affine curves whose function fields are equal to $F_{(1)}$ and $F_{(2)}$ respectively. Since $\phi$ is an isomorphism, substituting $\sum_{j=0}^{i}\mu_{i,j}t_j$ for $t_i$ (with $t_0=1$) yields an automorphism of $k[t_1,\dots,t_r]$ which accordingly maps $I_{(1)}$ to $I_{(2)}$.
Turning things around, assume now that the $\mu_{i,j}$ are indeterminates and $R=k[\mu_{i,j} \mid i\ge j\ge 0]$. Then $I_{(1)}$ also generates an ideal of $R[t_1,\dots,t_r]$, and we define $I'_{(2)}$ to be the ideal of $R[t_1,\dots,t_r]$ obtained from $I_{(1)}$ by substituting $\sum_{j=0}^{i}\mu_{i,j}t_j$ for $t_i$. Any solution $\mu_{i,j}\in k$ with $\mu_{i,i}\neq 0$ such that $I'_{(2)}=I_{(2)}$ yields an isomorphism of the coordinate rings $k[t_1,\dots,t_r]/I_{(1)}\to k[t_1,\dots,t_r]/I_{(2)}$ and thus an isomorphism $\phi$ of $F_{(1)}/k$ and $F_{(2)}/k$. It is actually sufficient to require only $I'_{(2)}\subseteq I_{(2)}$.

Summing up, our algorithm for computing one or all isomorphisms of $F_{(1)}/k$ and $F_{(2)}/k$ proceeds as follows. We first choose a suitable place of smallest degree or Weierstrass place $P_{(1)}$ of $F_{(1)}/k$ and compute all places $P_{(2)}$ which could correspond to $P_{(1)}$ under an isomorphism, respecting the condition of smallest degrees or Weierstrass with the same gap sequences. Then we compute $I_{(1)}$, $I_{(2)}$ and $I'_{(2)}$ and solve for the $\mu_{i,j}$ with $\mu_{i,i}\neq 0$ such that $I'_{(2)}\subseteq I_{(2)}$. This gives all possible isomorphisms $\phi$ of $F_{(1)}/k$ and $F_{(2)}/k$.

This concludes the description of the basic idea. Of course, there are various problems to overcome. For one, we need to find a convenient way to compute $I_{(1)}$ and $I_{(2)}$ and to check $I'_{(2)}\subseteq I_{(2)}$ in terms of the $\mu_{i,j}$. Another problem is that the number of the $\mu_{i,j}$ is roughly $g^2/2$, which makes the computation of $I'_{(2)}$ and finding a solution to $I'_{(2)}\subseteq I_{(2)}$ hard. To this end we essentially reduce the $\mu_{i,j}$ to two parameters. These issues are addressed in the following sections.

We finally remark that another way of computing isomorphisms is by using canonical curves. One advantage of this would be that the auxiliary places $P_{(1)}$ and $P_{(2)}$ could be avoided. Since these curves are uniquely determined up to linear transformation, we could compare a canonical curve for $F_{(1)}/k$ with a generically linearly transformed canonical curve of $F_{(2)}/k$, similarly to the above. As is, this would involve roughly $g^2$ indeterminates, and attempts to reduce this even larger number may again require the use of auxiliary places. Also, the computation of canonical curves is in general not very easy. Moreover, for our above strategy the number of the required $x_{(\alpha),i}$ can be considerably smaller than $g$, resulting in affine curves that are easier to handle, and it also works for hyperelliptic function fields. Therefore, we do not pursue the use of canonical curves in this paper.

3 Relating Affine Models

We now do not just choose the first $r$ pole numbers $n_i$ but let the $n_i$ be special generators of the Weierstrass semigroup at $P_{(\alpha)}$ satisfying the following condition. We require that $n_j\not\equiv n_i \bmod n_1$ for $1\le i<j$ and $1\le j\le r$. We observe that the $n_i$ are uniquely determined, because $n_1$ is the smallest pole number greater than zero at $P_{(\alpha)}$. The $x_{(\alpha),i}$ are again chosen in $L(n_iP_{(\alpha)})$ such that $v_{P_{(\alpha)}}(x_{(\alpha),i})=-n_i$.

Given the congruence inequality it is not difficult to see that the elements $1, x_{(\alpha),2},\dots,x_{(\alpha),r}$ are a $k[x_{(\alpha),1}]$-basis of the integral closure $\mathrm{Cl}(k[x_{(\alpha),1}],F)$. Because $[F:k(x_{(\alpha),1})]=n_1$ this implies $r=n_1$. Since $\mathrm{Cl}(k[x_{(\alpha),1}],F)$ is a free $k[x_{(\alpha),1}]$-module we know that there are uniquely determined $\lambda_{(\alpha),i,j,\nu}\in k[t]$ such that

\[ x_{(\alpha),i}\,x_{(\alpha),j}=\lambda_{(\alpha),i,j,1}(x_{(\alpha),1})+\sum_{\nu=2}^{n_1}\lambda_{(\alpha),i,j,\nu}(x_{(\alpha),1})\,x_{(\alpha),\nu}. \]

Looking at the valuations $v_{P_{(\alpha)}}$ we obtain $\deg(\lambda_{(\alpha),i,j,\nu})\le(n_i+n_j-n_\nu)/n_1$, and equality must hold for at least one $\nu$ because of the incongruence relations of the $n_i$. An algorithm to compute these $\lambda_{(\alpha),i,j,\nu}$ is given in [3].
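The selection of the special generators $n_i$ can be illustrated on a numerical semigroup. The sketch below assumes the Weierstrass semigroup is given by a finite list of generators with greatest common divisor one; it keeps, in increasing order, the first pole number in each residue class modulo $n_1$. The function names and the sample semigroups are illustrative.

```python
def pole_numbers(generators, bound):
    """Elements 1..bound of the numerical semigroup spanned by the given generators."""
    reachable = [True] + [False] * bound
    for m in range(1, bound + 1):
        reachable[m] = any(m >= g and reachable[m - g] for g in generators)
    return [m for m in range(1, bound + 1) if reachable[m]]

def special_generators(generators):
    """First pole number in each residue class mod n_1, in increasing order."""
    n1 = min(generators)
    bound = n1 * max(generators)      # enough to meet every residue class when gcd is 1
    chosen, seen = [], set()
    for m in pole_numbers(generators, bound):
        if m % n1 not in seen:
            seen.add(m % n1)
            chosen.append(m)
    return chosen

example = special_generators([3, 5])      # semigroup with gaps 1, 2, 4, 7
```

The returned list always has exactly $n_1$ entries, in line with the observation $r=n_1$ above.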

Let $I_{(\alpha)}$ be the kernel of the substitution homomorphism $k[t_1,\dots,t_{n_1}]\to F_{(\alpha)}$, $t_i\mapsto x_{(\alpha),i}$. From the previous discussion we see that $I_{(\alpha)}$ has a nice set of generators of the form $t_it_j-\lambda_{(\alpha),i,j,1}(t_1)-\sum_{\nu=2}^{n_1}\lambda_{(\alpha),i,j,\nu}(t_1)\,t_\nu$ for $2\le i,j\le n_1$. In particular, given any polynomial $f\in k[t_1,\dots,t_{n_1}]$ we can reduce it modulo $I_{(\alpha)}$ to a polynomial of degree at most one in $t_2,\dots,t_{n_1}$ by substituting terms $t_it_j$ with $\lambda_{(\alpha),i,j,1}(t_1)+\sum_{\nu=2}^{n_1}\lambda_{(\alpha),i,j,\nu}(t_1)\,t_\nu$.

Since we are working with a possibly smaller number of $x_{(\alpha),i}$ than in the previous section, these elements do not necessarily realize all pole numbers anymore. Because of the congruence conditions on the $n_i$ this can now be done using elements of the form $x_{(\alpha),1}^{\nu}x_{(\alpha),i}$, and a basis of $L(n_rP_{(\alpha)})$ is given by $\{x_{(\alpha),1}^{\nu}x_{(\alpha),i} \mid \nu n_1+n_i\le n_r\}$. It follows that the transformation of Section 2 takes the form $\phi(x_{(1),i})=\sum_{j}\mu_{i,j}(x_{(2),1})\,x_{(2),j}$ for suitable $\mu_{i,j}\in k[t]$ with $\deg(\mu_{i,j})\le(n_i-n_j)/n_1$ and $\mu_{i,i}\neq 0$. Reducing the generators of $I'_{(2)}$ modulo $I_{(2)}$ in $R[t_1,\dots,t_{n_1}]$ and equating the coefficients of the $t_i$ and $1$ with zero finally yields equations for the coefficients of the $\mu_{i,j}$.

4 Relating Expansions at a Place Depending on Two 
Parameters 

We now discuss how to relate $P$-adic expansions at $P_{(1)}$ and $P_{(2)}$ in a meaningful way, depending on only two parameters.

We let $\pi_{(\alpha)}$ denote a local uniformizer at $P_{(\alpha)}$. Computing expansions in terms of $\pi_{(\alpha)}$ we may embed $F_{(\alpha)}\subseteq k((\pi_{(\alpha)}))$ and the isomorphism $\phi$ extends to an isomorphism of $k((\pi_{(1)}))$ and $k((\pi_{(2)}))$. Local uniformizers are not uniquely determined and we cannot hope for $\phi(\pi_{(1)})=\pi_{(2)}$ if $\pi_{(1)}$ and $\pi_{(2)}$ are chosen independently. However, $\phi(\pi_{(1)})$ is a local uniformizer at $P_{(2)}$ and there are $c_i\in k$, $c_1\neq 0$ such that $\phi(\pi_{(1)})=\sum_{i=1}^{\infty}c_i\pi_{(2)}^i$. Of course, the $c_i$ are unknown to us. Looking at $x_{(1),1}$ and $x_{(2),1}$ we see that $\phi(x_{(1),1})=a\,x_{(2),1}+b$ for some $a,b\in k$, $a\neq 0$, which are also unknown to us. Furthermore, $x_{(\alpha),1}=\pi_{(\alpha)}^{-n_1}\sum_{i=0}^{\infty}d_{(\alpha),i}\pi_{(\alpha)}^i$ for $d_{(\alpha),i}\in k$, $d_{(\alpha),0}\neq 0$. These coefficients can be computed since $x_{(\alpha),1}$, $\pi_{(\alpha)}$ are known and lie in the same field respectively. Putting things together relates the $c_i$ to $a$, $b$ and gives the equation

\[ \sum_{i=0}^{\infty}d_{(1),i}\,\phi(\pi_{(1)})^i=a\,\phi(\pi_{(1)})^{n_1}\pi_{(2)}^{-n_1}\sum_{i=0}^{\infty}d_{(2),i}\,\pi_{(2)}^i+b\,\phi(\pi_{(1)})^{n_1}. \tag{1} \]

We want to solve this equation for $\phi(\pi_{(1)})$ and recursively for $c_1, c_2,\dots$. Let $n=n_1$. Equating the coefficients of $\pi_{(2)}^i$ for $i=0$ gives $d_{(1),0}=a\,c_1^n\,d_{(2),0}$, hence $c_1^n=d_{(1),0}(d_{(2),0}\,a)^{-1}$. Write $n=p^{r_n}n'$ with $n'\not\equiv 0 \bmod p$, where $p$ denotes the characteristic of $k$. Let $s\ge 1$ be such that $d_{(1),s}\neq 0$, and write $s=p^{r_s}s'$ with $s'\not\equiv 0 \bmod p$. For $p=0$ we let $n=n'$, $s=s'$ and $r_n=r_s=0$. The terms of the smallest degree in $\pi_{(2)}$ involving $c_j$ are of the form $d_{(1),s}(s'c_1^{s'-1}c_j)^{p^{r_s}}\pi_{(2)}^{s+(j-1)p^{r_s}}$, $a(n'c_1^{n'-1}c_j)^{p^{r_n}}\pi_{(2)}^{(j-1)p^{r_n}}$ and $b(n'c_1^{n'-1}c_j)^{p^{r_n}}\pi_{(2)}^{n+(j-1)p^{r_n}}$ for the three main expressions in equation (1) respectively, where $s+(j-1)p^{r_s}$ is minimal with $d_{(1),s}\neq 0$. As a result, using $a=d_{(1),0}(c_1^n\,d_{(2),0})^{-1}$, there are monic $f_j\in k[c_1,c_1^{-1}][t]$ consisting only of $p$-power terms in $t$ and $g_j\in k[b,c_1,c_1^{-1},c_2,c_3,\dots,c_{j-1}]$ for every $j\ge 2$ such that equation (1) implies $f_j(c_j)=g_j$. Regarding $a$, $b$ and the $c_j$ as indeterminates we define $R_{a,b}=k[a,b,c_1,c_1^{-1},c_2,c_3,\dots]/I_{a,b}$, where $I_{a,b}$ is the ideal generated by $d_{(1),0}-a\,c_1^n\,d_{(2),0}$ and $f_j(c_j)-g_j$ for $j\ge 2$, and obtain the generic embedding $\phi_{a,b} : k((\pi_{(1)}))\to R_{a,b}((\pi_{(2)}))$, since the image of $\pi_{(1)}$ under $\phi_{a,b}$ is invertible in $R_{a,b}((\pi_{(2)}))$. Now $\phi_{a,b}$ specializes to $\phi$ if we substitute the correct elements of $k$ for $a$, $b$ and the $c_i$ corresponding to $\phi$.

For $n\not\equiv 0 \bmod p$ or $p=0$ the terms of the smallest degree in $\pi_{(2)}$ involving $c_j$ are $a(n'c_1^{n'-1}c_j)^{p^{r_n}}\pi_{(2)}^{(j-1)p^{r_n}}=a\,n\,c_1^{n-1}c_j\,\pi_{(2)}^{j-1}$ and the $f_j$ are hence all linear. This means that $c_1$ is an $n$-th root depending on $a$ and that all other $c_i$ are uniquely determined by $c_1$ and $b$. For $n\equiv 0 \bmod p$ a more detailed analysis of the $f_j$ shows that there can be at most $n$ solution vectors $(c_1,c_2,\dots)$ satisfying equation (1). The $c_i$ thus also depend only on $a$, $b$ up to finitely many possibilities. Looking at higher powers of $\pi_{(2)}$ may give (and usually gives) additional linear conditions on the $c_j$ if the corresponding $d_{(1),s}$ are not zero. We remark that this strategy is not efficient if the series expansions of $x_{(1),1}$ and $x_{(2),1}$ are such that a larger number of the $f_j$ do not have degree one (necessarily $n\equiv 0 \bmod p$).

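A particularly easy form of the powers of $\phi(\pi_{(1)})$ in terms of $\pi_{(2)}$ can be achieved for $n\not\equiv 0 \bmod p$ or $p=0$, if we choose $\pi_{(1)}$ and $\pi_{(2)}$ in a special way. Let $h_{(\alpha)}=t^n-1/x_{(\alpha),1}$. Since $v_{P_{(\alpha)}}(1/x_{(\alpha),1})=n_1$ it follows that $h_{(\alpha)}$ has a root $\tilde\pi_{(\alpha)}\in k[[\pi_{(\alpha)}]]$, and $\tilde\pi_{(\alpha)}$ is a local uniformizer. If we require $\tilde\pi_{(\alpha)}=\pi_{(\alpha)}+O(\pi_{(\alpha)}^2)$ then $\tilde\pi_{(\alpha)}$ is uniquely determined by $x_{(\alpha),1}$. Inverting the representation of $\tilde\pi_{(\alpha)}$ in terms of $\pi_{(\alpha)}$ gives a representation of $\pi_{(\alpha)}$ in terms of $\tilde\pi_{(\alpha)}$, and we may hence write everything in terms of $\tilde\pi_{(\alpha)}$ or equivalently embed $F_{(\alpha)}\subseteq k((\tilde\pi_{(\alpha)}))$. Also, $\phi$ extends to an isomorphism of $k((\tilde\pi_{(1)}))$ and $k((\tilde\pi_{(2)}))$, but again $\phi(\tilde\pi_{(1)})=\tilde\pi_{(2)}$ is not necessarily true. Equation (1) then simplifies to the following equation,

\[ 1=a\,\phi(\tilde\pi_{(1)})^n/\tilde\pi_{(2)}^n+b\,\phi(\tilde\pi_{(1)})^n. \tag{2} \]

With $\phi(\tilde\pi_{(1)})=c_1\tilde\pi_{(2)}+O(\tilde\pi_{(2)}^2)$ and $c_1^n=1/a$, equation (2) yields

\[ \phi(\tilde\pi_{(1)})^r=c_1^r\,\tilde\pi_{(2)}^r\sum_{i=0}^{\infty}\rho_i\,(b/a)^i\,\tilde\pi_{(2)}^{ni}, \tag{3} \]

where $r\ge 0$ and $\rho_i\in k$. This explicit form will be used in the next section.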
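The shape of the expansion for powers of $\phi(\tilde\pi_{(1)})$ can be derived directly from equation (2). The following is a sketch of that derivation, assuming $n$ is invertible in $k$; the identification of the coefficients with generalized binomial coefficients is our reading and is not stated in the source.

```latex
% Solve equation (2) for the n-th power of \phi(\tilde\pi_{(1)}):
\[
  \phi(\tilde\pi_{(1)})^{n}\left(\frac{a}{\tilde\pi_{(2)}^{n}}+b\right)=1
  \quad\Longrightarrow\quad
  \phi(\tilde\pi_{(1)})^{n}
  =\frac{\tilde\pi_{(2)}^{n}}{a}\left(1+\frac{b}{a}\,\tilde\pi_{(2)}^{n}\right)^{-1}.
\]
% Take the n-th root with leading coefficient c_1, where c_1^n = 1/a:
\[
  \phi(\tilde\pi_{(1)})
  =c_1\,\tilde\pi_{(2)}\left(1+\frac{b}{a}\,\tilde\pi_{(2)}^{n}\right)^{-1/n}.
\]
% Raise to the r-th power and expand with the binomial series:
\[
  \phi(\tilde\pi_{(1)})^{r}
  =c_1^{r}\,\tilde\pi_{(2)}^{r}\sum_{i=0}^{\infty}\binom{-r/n}{i}
   \left(\frac{b}{a}\right)^{i}\tilde\pi_{(2)}^{ni}.
\]
```

Under this reading the coefficients depend only on $r$, $n$ and $i$, so that $a$, $b$ and $c_1$ enter only through $c_1^r$ and the powers of $b/a$.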
5 Relating Affine Models at a Place Depending on Two
Parameters

We now want to reduce the number of indeterminate coefficients in the polynomials $\mu_{i,j}$ to basically two indeterminates.

Using the notation of the previous section we have $\mu_{1,1}=a$, $\mu_{1,0}=b$ and $\mu_{0,0}=1$. A local uniformizer at $P_{(\alpha)}$ is given by $\pi_{(\alpha)}$, and for $n_1\not\equiv 0 \bmod p$ or $p=0$ we may assume that $\pi_{(\alpha)}$ is an $n_1$-th root of $1/x_{(\alpha),1}$.

For every pole number $d_i$ of $P_{(\alpha)}$ let $w_{(\alpha),i}=x_{(\alpha),1}^{\nu}x_{(\alpha),j}$ where $d_i=\nu n_1+n_j$. As mentioned, the $w_{(\alpha),0},\dots,w_{(\alpha),i}$ form a basis of $L(d_iP_{(\alpha)})$ because of the congruence relations of the $n_j$. We can compute the expansions $w_{(\alpha),j}=\sum_{\nu\ge -d_j}\rho_{(\alpha),\nu,j}\,\pi_{(\alpha)}^{\nu}$ to any precision. However, precision $O(\pi_{(\alpha)})$ will be sufficient. Using a Gaussian elimination procedure we may assume that $\rho_{(\alpha),i,\nu}=0$ for all $i=-d_j$ and $\nu>j$, and $\rho_{(\alpha),-d_j,j}=1$ for all $0\le j\le d_i$: For decreasing $j$ we simply eliminate the $-d_j$-th coefficients in the expansions of the $w_{(\alpha),\nu}$ by replacing $w_{(\alpha),\nu}$ with $w_{(\alpha),\nu}-(\rho_{(\alpha),-d_j,\nu}/\rho_{(\alpha),-d_j,j})\,w_{(\alpha),j}$ for all $\nu>j$. Finally, we replace $w_{(\alpha),j}$ by $(1/\rho_{(\alpha),-d_j,j})\,w_{(\alpha),j}$. We now define $x_{(\alpha),j}=w_{(\alpha),i}$ where $n_j=d_i$. The resulting $x_{(\alpha),j}$ are then uniquely determined depending on the chosen local uniformizer.
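The elimination step just described can be sketched on truncated Laurent expansions. The code below is an illustration over the rationals, assuming each expansion is stored as a dictionary from exponents to coefficients and that the pole orders increase strictly with the index; the function name and the sample data are hypothetical.

```python
from fractions import Fraction

def normalize_expansions(w):
    """Gaussian elimination on expansions w[j] = {exponent: coefficient}.

    The lowest exponent of w[j] is -d_j, with d_0 < d_1 < ... . Afterwards each
    w[j] has leading coefficient 1 and every w[nu], nu > j, has zero coefficient
    at exponent -d_j.
    """
    w = [{e: Fraction(c) for e, c in s.items()} for s in w]
    lead = [min(s) for s in w]                       # the exponents -d_j
    for j in range(len(w) - 1, -1, -1):              # decreasing j
        pivot = w[j][lead[j]]
        for nu in range(j + 1, len(w)):
            factor = w[nu].get(lead[j], 0) / pivot
            if factor:
                for e, c in w[j].items():
                    w[nu][e] = w[nu].get(e, 0) - factor * c
        w[j] = {e: c / pivot for e, c in w[j].items()}
    return [{e: c for e, c in s.items() if c != 0} for s in w]

sample = [{0: 1}, {-2: 2, -1: 3, 0: 5}, {-3: 1, -2: 4, 0: 7}]
reduced = normalize_expansions(sample)
```

In the sample, the second and third expansions end up with leading coefficient 1 and no constant term, and the third has no term of exponent $-2$, which mirrors the uniqueness claim above.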

In the two fields situation we assume that the $x_{(2),i}$ have been transformed using this Gaussian elimination procedure. We now know that the Gaussian elimination for the $\phi(x_{(1),i})$ would yield the $x_{(2),i}$, but the $\phi(x_{(1),i})$ are unknown to us. We can however perform this Gaussian elimination for the $\phi_{a,b}(x_{(1),i})$ over $R_{a,b}$ in a generic way. Note that the "leading" coefficients of the $\phi_{a,b}(x_{(1),i})$ are basically powers of $c_1$ and are hence invertible in $R_{a,b}$. The Gaussian elimination procedure thus precisely yields the $\mu_{i,j}$ as elements of $R_{a,b}[t]$. Since we can work with precision $O(\pi_{(2)})$, we only have to deal and compute with elements of $R_{a,b}$ involving $c_i$ for $i\le n_r+1$.

Taking up the strategy of Section 3 (see in particular its end) and clearing $c_1$ from the denominators in the coefficients of the $\mu_{i,j}$ and in $R_{a,b}$ yields equations in $a$, $b$ and the $c_i$ with $i\le n_r+1$. The corresponding ideal in $k[a,b,c_1,\dots,c_{n_r+1}]$ is zero dimensional, since there can only be finitely many solutions for $a$, $b$, and the $c_i$ can only assume finitely many different values given $a$, $b$. Thus the intersections with $k[a]$ and $k[a,b]$ also result in zero dimensional ideals, and the possible values of $a$ and $b$ and then the $c_i$ can be computed from this.

For $n_1\not\equiv 0 \bmod p$ or $p=0$ things are much more explicit. Using the special local uniformizer we perform the Gaussian elimination on the $x_{(1),i}$ and the $x_{(2),i}$. Because of equation (2) this special form is almost preserved by $\phi$, namely we have that $\phi(x_{(1),i})=c_1^{-n_i}x_{(2),i}$ with $c_1^{n_1}=1/a$ for $i\ge 2$ and $\phi(x_{(1),1})=a\,x_{(2),1}+b$. Then

\[ \begin{aligned} \phi(x_{(1),i}\,x_{(1),j})&=\lambda_{(1),i,j,1}(a\,x_{(2),1}+b)+\sum_{\nu=2}^{n_1}\lambda_{(1),i,j,\nu}(a\,x_{(2),1}+b)\,c_1^{-n_\nu}\,x_{(2),\nu},\\ \phi(x_{(1),i})\,\phi(x_{(1),j})&=c_1^{-n_i-n_j}\Bigl(\lambda_{(2),i,j,1}(x_{(2),1})+\sum_{\nu=2}^{n_1}\lambda_{(2),i,j,\nu}(x_{(2),1})\,x_{(2),\nu}\Bigr). \end{aligned} \]

Combining the right hand sides via $\phi(x_{(1),i}\,x_{(1),j})=\phi(x_{(1),i})\,\phi(x_{(1),j})$ yields easy equations for $a$, $b$ and $c_1$.

6 Computing the Isomorphisms 

The choice of the sets of places of $F_{(1)}/k$ and $F_{(2)}/k$ which must be mapped to
each other under any isomorphism $\phi$ depends mainly on the constant field and
the genus. For constant fields other than finite fields we always consider subsets
of the set of Weierstrass places of $F_{(1)}/k$ and $F_{(2)}/k$ of smallest cardinality, where
the places have a particular common gap sequence. The number of such places
can be quite high: in characteristic zero it is bounded below by $2(g + 1)$ and above
by $(g - 1)g(g + 1)$, and in general it is in $O(g^3)$. This leads to roughly
up to $g^3$ comparisons of places $P_{(1)}$ and $P_{(2)}$. For finite fields we check whether
the estimated number $q \pm 2gq^{1/2}$ of places of degree one is (considerably) smaller
than an approximate expected number of Weierstrass places. This would lead to
roughly $q$ comparisons of places, which is in $O(g^3)$ but can also be much smaller.
Note that places of a prescribed small degree can be computed efficiently for
global function fields. A bound for the number of isomorphisms is given by the
Hurwitz bound $84(g - 1)$ in characteristic zero and by roughly $16g^4$ in positive
characteristic (details can be found in [7]). Such large automorphism groups are
only obtained for very special function fields.
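
The bounds quoted above are easy to tabulate for a given genus. The following is only an illustrative sketch: the function names are hypothetical, the genus is assumed to be at least 2, and the rule `g**3 < q**d` is the threshold test used in steps 3 and 4 of Algorithm 4.

```python
import math

def place_count_bounds(g, q=None):
    """Collect the bounds quoted in the text for genus g (assumed >= 2).

    If q (the size of a finite constant field) is given, also return the
    estimated range q +/- 2*g*sqrt(q) for the number of places of degree one.
    """
    bounds = {
        "weierstrass_lower_char0": 2 * (g + 1),
        "weierstrass_upper_char0": (g - 1) * g * (g + 1),
        "hurwitz_bound_char0": 84 * (g - 1),
        "automorphism_bound_char_p": 16 * g ** 4,
    }
    if q is not None:
        spread = 2 * g * math.sqrt(q)
        bounds["degree_one_places_range"] = (max(0.0, q - spread), q + spread)
    return bounds

def prefer_weierstrass_places(g, q, d=1):
    """Threshold test of Algorithm 4: use Weierstrass places when g^3 < q^d."""
    return g ** 3 < q ** d

if __name__ == "__main__":
    print(place_count_bounds(3, q=4))
    print(prefer_weierstrass_places(3, 4), prefer_weierstrass_places(3, 32))
```

For genus 3 over a field with 4 elements the test selects places of degree $d$ rather than Weierstrass places, since $27 \ge 4$.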

It may happen that we cannot find suitable places of degree one, although this
was essential for the strategy described above. A solution to this problem is to
consider constant field extensions, over which we would obtain suitable places
of degree one. For example, if $P_{(a)}$ is a place of degree two over $k$ which splits
into Weierstrass places over an algebraic closure, it is sufficient to consider the
constant field extension $F_{(a)}k(P_{(a)})/k(P_{(a)})$ by the residue class field of $P_{(a)}$,
since $P_{(a)}$ already splits in $F_{(a)}k(P_{(a)})/k(P_{(a)})$. We would then compute an
isomorphism of $k(P_{(1)})$ and $k(P_{(2)})$ and isomorphisms $\phi$ of $F_{(1)}k(P_{(1)})/k(P_{(1)})$ and
$F_{(2)}k(P_{(2)})/k(P_{(2)})$ which possibly do not come from isomorphisms of $F_{(1)}/k$
and $F_{(2)}/k$. This can be checked as follows. A generating system of $F_{(a)}/k$ is
also a generating system of $F_{(a)}k(P_{(a)})/k(P_{(a)})$. If we compute the effect of $\phi$
in terms of these generating systems it is easy to check whether $\phi$ restricts to
an isomorphism of $F_{(1)}/k$ and $F_{(2)}/k$. This operation requires the inversion of
isomorphisms, or inverting the representation of one set of generators in terms of
another set of generators. A way of doing this computation is described in [3]; it
can also be achieved for our special elements $x_{(a),i}$ by linear algebra over $k$
involving algebraic functions of bounded degree. Since every isomorphism of $F_{(1)}/k$ and
$F_{(2)}/k$ extends to an isomorphism of $F_{(1)}k(P_{(1)})/k(P_{(1)})$ and $F_{(2)}k(P_{(2)})/k(P_{(2)})$,
this method yields all isomorphisms of $F_{(1)}/k$ and $F_{(2)}/k$.

We summarize the single steps for computing the isomorphisms between
$F_{(1)}/k$ and $F_{(2)}/k$.
Algorithm 4 (Isomorphisms)

Input: Function fields $F_{(1)}/k$ and $F_{(2)}/k$.

Output: A set $L$ of all isomorphisms of $F_{(1)}/k$ and $F_{(2)}/k$.

1. Check whether $F_{(1)}/k$ and $F_{(2)}/k$ have the same genus $g$. If not then return
the empty list $L$.

2. Let $p = \mathrm{char}(k)$, $q = \#k$ and $d = 1$.

3. If $g^3 < q^d$ compute lists $S_{(1)}$ and $S_{(2)}$ of all Weierstrass places of $F_{(1)}/k$
and of $F_{(2)}/k$ (degrees greater than one allowed). Check whether numbers, degrees
and gap sequences coincide. Determine suitable small subsets $S'_{(1)}$ and $S'_{(2)}$
as discussed above.

4. If $g^3 \ge q^d$ compute lists $S'_{(1)}$ and $S'_{(2)}$ of all places of degree $d$. If there are
no such places let $d \leftarrow d + 1$ and go to step 3.

5. Choose a fixed $P_{(1)} \in S'_{(1)}$. If $\deg(P_{(1)}) > 1$ work with $F_{(1)}k(P_{(1)})/k(P_{(1)})$
in the following. Compute the $n_i$, $x_{(1),i}$ and $d_j$, $w_{(1),j}$ as in Section 3 and
Section 5. If $n_1 \not\equiv 0 \bmod p$ or $p = 0$ set $c = 1$, otherwise set $c = 0$. Choose
a local uniformizer $\pi_{(1)}$ at $P_{(1)}$. If $c = 1$ then choose the special $\pi_{(1)}$ of
Section 4. Apply the Gaussian elimination procedure of Section 5 to the $x_{(1),i}$.

6. The following three steps are done for every $P_{(2)} \in S'_{(2)}$.

7. Check that $k(P_{(2)})/k$ is isomorphic to $k(P_{(1)})/k$. If not, take the next $P_{(2)}$. If
yes work with $F_{(2)}k(P_{(2)})/k(P_{(2)})$ in the following and identify $k(P_{(1)})$ and
$k(P_{(2)})$.

8. Compute the $x_{(2),i}$ and $w_{(2),j}$ for $P_{(2)}$ as in Section 3 and Section 5. If $c = 1$
then choose the special $\pi_{(2)}$ of Section 4. Apply the Gaussian elimination
procedure of Section 5 to the $x_{(2),i}$.

9. If $c = 1$ then solve for $a, b, c_1$ as described at the end of Section 5. Otherwise
solve for $a, b, c_1, c_2, \ldots$ as described in Section 5. For any solution recover the
$\mu_{i,j}$ and $\phi$. If $\phi$ restricts to an isomorphism $\phi'$ of $F_{(1)}/k$ and $F_{(2)}/k$, then
add $\phi'$ to $L$.

10. Return $L$.

We remark that we have implemented a prototype of Algorithm 4 in Magma
[1,2] which shows that the algorithm is quite practical in the global function field
case.
Abstract. We present a technique to recover $f \in \mathbb{Q}(\zeta_p)$, where $\zeta_p$ is a
primitive $p$th root of unity for a prime $p$, given its norm $g = f * \bar{f}$ in
the totally real field $\mathbb{Q}(\zeta_p + \zeta_p^{-1})$. The classical method of solving this
problem involves finding generators of principal ideals by enumerating
the whole class group associated with $\mathbb{Q}(\zeta_p)$, but this approach quickly
becomes infeasible as $p$ increases. The apparent hardness of this problem
has led several authors to suggest the problem as one suitable for
cryptography. We describe a technique which avoids enumerating the class
group, and instead recovers $f$ by factoring $N_f$, the absolute norm of $f$
(for example with a subexponential sieve algorithm), and then running
the Gentry-Szydlo polynomial time algorithm for a number of candidates.
The algorithm has been tested with an implementation in PARI.

1 Introduction 

We present an algorithm to solve degree two norm equations corresponding to
the field extension $\mathbb{Q}(\zeta_p) / \mathbb{Q}(\zeta_p + \zeta_p^{-1})$, where $\zeta_p$ is a primitive $p$th root of unity.

The previously best known technique for solving this problem involves finding
a generator of the principal ideal $(f)$ by enumerating representatives of
the whole class group, and then applying index calculus techniques to obtain a
generator of $(f)$. This approach is explained in a little more detail in section 2,
but we note it becomes very expensive as $p$ increases (even the class group is on
the order of $p^{p/4}$).

In [6,12,13] the authors explicitly assume the hardness of the cyclotomic norm 
equation problem to build cryptographic applications. It has been observed by 
several sources [5,8,10] that their constructions can easily be modified to ones 
which are not reliant on this particular assumption, for example by adding some 
kind of perturbation or noise factor. Such enhancements are interesting avenues 
for further cryptographic research, but in this paper we concentrate on the purely 
mathematical problem of solving the norm equation. 

Our work is motivated by a concrete question concerning polynomial
arithmetic in the ring $R = \mathbb{Z}[X]/(X^p - 1)$. To describe our problem, we define, for
each polynomial $f \in R$, its reversal, $f_{rev}$, to be the polynomial $f_{rev}(X) = f(X^{-1})$. An
element in $R$ which is equal to its own reversal is called a palindrome. Given a
palindromic element in $R$ of the form $g = f * f_{rev}$, the task is to find such an $f$.

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 272-279, 2004.

Our algorithm builds on an earlier work [9] which is able to recover an element
$f \in \mathbb{Z}[X]/(X^p - 1)$, up to multiplication by $\pm X^k$, from the principal ideal $(f)$
and from the quantity $f * f_{rev}$.

This problem of “factoring” $f * f_{rev}$ is easily seen to be related to a norm
equation as follows. Elements of $R$ naturally map to the quotient
$\mathbb{Z}[X]/(1 + X + \cdots + X^{p-1})$, the ring of integers in the $p$th cyclotomic field $\mathbb{Q}(\zeta_p)$. Under this map, a
polynomial and its reversal map to Galois conjugates in the field extension
$\mathbb{Q}(\zeta_p) / \mathbb{Q}(\zeta_p + \zeta_p^{-1})$; the product of the conjugates is the norm in $\mathbb{Q}(\zeta_p + \zeta_p^{-1})$.
Solving such a norm equation yields a cyclotomic integer, and knowledge of $f(1)$
is sufficient to determine $f$ as an element of $\mathbb{Z}[X]/(X^p - 1)$. The integer $f(1)$,
in turn, is known up to sign since $(f * f_{rev})(1) = f(1)^2$.

Some authors call $g$ the autocorrelation of $f$, and work more generally in the
ring $\mathbb{R}[X]/(X^N - 1)$. In [6,7] instances of the problem are considered where the
coefficients of $f$ are taken to all be in $\{0, 1\}$, and the problem of recovering an
$f$ from the product $f * f_{rev}$ is called bit retrieval.
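
As a small worked illustration of these definitions, the sketch below computes $f * f_{rev}$ in $\mathbb{Z}[X]/(X^p - 1)$ with coefficient lists of length $p$. The helper names are hypothetical and the quadratic-time product is chosen only for clarity; the reversal is taken to be $f(X^{-1})$, which is the reading consistent with the conjugation property described above.

```python
def reversal(f):
    """Reversal in Z[X]/(X^p - 1): the coefficient of X^i moves to X^(-i mod p)."""
    p = len(f)
    return [f[(-i) % p] for i in range(p)]

def cyclic_product(f, h):
    """Product of two elements of Z[X]/(X^p - 1), given as coefficient lists."""
    p = len(f)
    out = [0] * p
    for i, a in enumerate(f):
        for j, b in enumerate(h):
            out[(i + j) % p] += a * b
    return out

def autocorrelation(f):
    """g = f * f_rev, the quantity from which f is to be recovered."""
    return cyclic_product(f, reversal(f))

if __name__ == "__main__":
    f = [1, 1, 0, 1, 0]              # f = 1 + X + X^3 with p = 5
    g = autocorrelation(f)
    print(g)                          # g is a palindrome
    print(sum(g) == sum(f) ** 2)      # g(1) = f(1)^2
```

Replacing $f$ by $\pm X^k f$ leaves $g$ unchanged, which is the ambiguity discussed in Section 3.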

In this paper, unless otherwise noted, polynomials are elements of the ring 
R. When it is clear, we use the same symbol for the projection to the ring of 
cyclotomic integers. 



2 The Classical Approach to Solving This Problem 

The general principal ideal problem is, given an ideal $I$ which is known to be
principal, to find a generator of $I$.

The standard technique for solving the principal ideal problem involves
constructing a huge “factor base” consisting of ideals with small norm, representing
every element of the class group.

The collection of a similarly large number of relations (which are “smooth”
over the factor base) will yield a generator for the target ideal. See [3],
section 6.5.5 for more details of this approach.

The class number of cyclotomic fields is on the order of $p^{p/4}$; see [4] for a more
detailed analysis of this distribution. As a result, the factor base approach
becomes infeasible very quickly, which has led various people to suggest the problem as a
basis for cryptographic schemes; see [2,6,12,13].

3 Statement of the Problem 

Problem 1. Let $p$ be prime and let $f \in R = \mathbb{Z}[X]/(X^p - 1)$. Let $g = f * f_{rev}$.
Given $g$, determine such an $f$.

Clearly, the problem does not have a unique solution; if $f$ is a solution, then
so is $\pm X^k f$. To understand precisely the ambiguity of $f$, let us consider several
ways in which we might enlarge the set of acceptable solutions. Below, $f$ is
considered as a cyclotomic integer, and $\bar{f}$ denotes its complex conjugate.
Variant 1. Find fractional ideals $F$ such that $F * \bar{F} = (g)$.

For an ideal $F$ which is a solution, the ideal $FA$ will also be a solution
if $A\bar{A} = (1)$. Each ideal of the form $B/\bar{B}$ has this property, so there will be
infinitely many solutions.

Variant 2. Find integral ideals $F$ such that $F * \bar{F} = (g)$.

To restrict solutions to integral ideals, let $P_1^{a_1} \cdots P_k^{a_k}$ be the prime
factorization of $(f)$. Then for any $0 \le b_i \le a_i$, the ideal

$$P_1^{b_1} \bar{P}_1^{(a_1 - b_1)} \cdots P_k^{b_k} \bar{P}_k^{(a_k - b_k)}$$

will be a solution. So there will be exactly $\prod (a_i + 1)$ solutions in total.

Variant 3. Find cyclotomic field elements $f$ such that $f * \bar{f} = g$.

This restricts the solutions of variant 1, which are ideals, to those which are
principally generated. If a field element $f$ is a solution, we see that the element
$fa$ will also be a solution, provided that $a\bar{a} = 1$. Such $a$ are simply the elements
of the form $a = b/\bar{b}$, and so if there is one solution, there will also be infinitely
many solutions.

Variant 4. Find cyclotomic integers $f$ such that $f * \bar{f} = g$.

This is in fact the flavor of the problem we are most interested in. Restricting
solutions to cyclotomic integers, we see that the generator $f$ of each principal solution ideal
of variant 2 will yield a solution for this fourth variant. For a solution $f$, the
multiples $fu$ which are also solutions correspond to the values of $u$ which are
$2p$th roots of unity.

Thus, there are $2p$ times the number of principal ideal solutions. In particular
we obtain $2p \prod (a_i + 1)$ as a bound on the total. In the case that $(f)$ is a prime
ideal, there are exactly $2p$ solutions.

4 Algorithm Overview 

At a high level, there are two components to the algorithm. The first step
computes candidate ideals $F$ for the desired ideal $(f)$. The second step consists of
recovering $f$ from the ideal $(f)$ and the element $g = f * f_{rev}$. The second step
can be accomplished with the result of [9], reviewed below in section 5.1, or in
some cases, by a simplified version of it.

We illustrate the algorithm first in a somewhat special case, then proceed
to show how it can be extended. In this first case, we are going to assume that
the prime $p$ is congruent to 3 (mod 4), and additionally that the norm of $f$
down to $\mathbb{Z}$ does not have any repeated factors. Our first assumption implies that
$\mathbb{Q}(\sqrt{-p})$ is a subfield of the $p$th cyclotomic field, so that there is a diagram of
field extensions.

$$\begin{array}{ccc}
\mathbb{Q}(\zeta) & \leftarrow & \mathbb{Q}(\zeta + \zeta^{-1}) \\
\uparrow & & \uparrow \\
\mathbb{Q}(\sqrt{-p}) & \leftarrow & \mathbb{Q}
\end{array}$$



The essence of the first step of the algorithm is the determination of a set of
potential ideals for $(f)$. As in the classical approach to our problem, this may be
accomplished by factoring $f * f_{rev}$ into prime ideals. The candidate ideals will
be the ideals of $R$ which have the same norm as $f$, and also contain $(f * f_{rev})$.

To factor an ideal in the ring of cyclotomic integers, one appeals to the classical
prime decomposition theory [1], using the prime factorization of the absolute
norm. Algorithmically, beginning with $f * f_{rev}$ in $\mathbb{Q}(\zeta + \zeta^{-1})$, it is easy to compute
the norm $N_f$ down to $\mathbb{Q}$. Our assumption that each rational prime $q$ divides this
norm with multiplicity at most 1 implies that $q$ splits completely in $\mathbb{Q}(\zeta)$. Thus,
it makes the use of arithmetic in the quadratic subfield even simpler.

There are two primes above each such $q$ in $\mathbb{Q}(\sqrt{-p})$. Among the various
possible products, $I$, of these ideals in $\mathbb{Q}(\sqrt{-p})$ are the norms of the solutions $f$.
By lifting each such $I$ to $\mathbb{Q}(\zeta)$, and computing the greatest common divisor with
$(f * f_{rev})$, one obtains an ideal $F$ with the property that $F * \bar{F} = (f * f_{rev})$.
This can be considered a solution to the problem variant 2, above, and will be
deemed a candidate ideal.

The main algorithmic difficulty of determining the potential ideals in
$\mathbb{Q}(\sqrt{-p})$ is the integer factorization of $N_f$ in $\mathbb{Z}$. Once this has been accomplished
the above procedure yields a short list of candidate ideals.

Once one is in possession of both $(f)$ and $f * f_{rev}$, the algorithm of [9] can be
used to find $f$ up to a root of 1. We provide further details in the next section.

5 Algorithm Details for $p \equiv 3 \bmod 4$

In this section we provide further details of each step.

Norm Calculation. The initial step consists in calculating the norm of $f$
down to $\mathbb{Q}$. This norm is the product $\prod f(\zeta^i)$ over all primitive $p$th roots of 1.
Grouping the factors into conjugate pairs, we see that this can be efficiently
computed from the $(p - 1)/2$ conjugates $(f * f_{rev})(\zeta^i)$. Equivalently, the norm
of $f$ is the square root of the norm of $f * f_{rev}$ in $\mathbb{Q}(\zeta)$. Let $N_f$ denote this integer.

Prime Factorization. Next, the prime factors of $N_f$ must be determined.
Unless $N_f$ turns out to be prime, or the factorization easy, this step is the most
computationally difficult part of the algorithm, and prohibitive if $p$ is large.
Depending on size, a standard elliptic curve or sieving algorithm may be successful.
Let $q_1^{a_1} \cdots q_k^{a_k}$ denote the prime factorization of $N_f$.

Ideals in $\mathbb{Q}(\sqrt{-p})$. Each prime in $\mathbb{Z}$ either splits or is inert in $\mathbb{Q}(\sqrt{-p})$. By
making the assumption that $N_f$ has no repeated factors, we ensure that each
prime $q_i$ splits. The two factors of $(q)$ are then $(q, r + \sqrt{-p})$ and $(q, r - \sqrt{-p})$,
where $r^2 \equiv -p \pmod{q}$.

A candidate ideal $I$ in $\mathbb{Q}(\sqrt{-p})$ is computed by multiplying together exactly
one ideal above each $q_i$. The ideal arithmetic in this step is particularly efficient,
given that we are only working in a quadratic field.
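
The enumeration of candidate ideals in the quadratic field can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, the ideal above $q$ is encoded by a pair `(q, r)` standing for $(q, r + \sqrt{-p})$ with $r^2 \equiv -p \pmod q$ (an assumed representation), and the square root is found by brute force, which is adequate only for small odd primes $q \ne p$.

```python
from itertools import product

def splits_in_imag_quadratic(q, p):
    """Euler's criterion: the odd prime q (q != p) splits in Q(sqrt(-p))
    exactly when -p is a nonzero square modulo q."""
    return pow((-p) % q, (q - 1) // 2, q) == 1

def sqrt_of_minus_p(q, p):
    """Smallest r in [1, q) with r*r = -p (mod q); brute force for small q."""
    for r in range(1, q):
        if (r * r + p) % q == 0:
            return r
    raise ValueError("q does not split in Q(sqrt(-p))")

def candidate_ideals(primes, p):
    """Yield every way of choosing one prime ideal above each q.

    Each choice is a list of pairs (q, +r) or (q, -r), an assumed encoding
    of the ideals (q, r + sqrt(-p)) and (q, -r + sqrt(-p)).
    """
    roots = [(q, sqrt_of_minus_p(q, p)) for q in primes]
    for signs in product((1, -1), repeat=len(roots)):
        yield [(q, s * r) for (q, r), s in zip(roots, signs)]

if __name__ == "__main__":
    p = 7
    print(splits_in_imag_quadratic(11, p), splits_in_imag_quadratic(3, p))
    for choice in candidate_ideals([11, 23], p):
        print(choice)
```

With $k$ distinct prime factors there are $2^k$ such choices, which is why the integer factorization, and not this enumeration, dominates the cost when $k$ is small.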

Testing Principality

A necessary condition that the ideal $I$ be the norm of a solution $f$ is that
$I$ is a principal ideal. If the class number of $\mathbb{Q}(\sqrt{-p})$ is not large, the ideal $I$
can be efficiently tested for principality as in [3]. This may be feasible, since the
class numbers of quadratic imaginary fields seem to grow at a reasonable rate. If
it is concluded that $I$ is not principal, the subsequent steps of our algorithm
cannot succeed, so $I$ may be discarded. We remark that this step is optional. If all
of the potential ideals $I$ were to be processed in parallel, eventually a solution
would be found.

GCD Computation. In this step we determine the candidate ideals $F$ of
$\mathbb{Q}(\zeta)$. Another consequence of assuming that $N_f$ has no repeated factors is the
fact that $f$ and $f_{rev}$ are relatively prime. Additionally, $f_{rev}$ is prime to the other
$(p - 3)/2$ Galois conjugates of $f$ dividing $I$.

Now consider both $I$ and $(f * f_{rev})$ as ideals of $\mathbb{Q}(\zeta)$, and compute the greatest
common divisor ideal. This is efficiently accomplished by forming the $\mathbb{Z}$ span of
generators of the two $\mathbb{Z}$-modules [3]. The resulting ideal is an ideal $F$ with the
property that $F * \bar{F} = (g)$, and is thus a solution to variant 2 of our problem.
Only if $F$ is principal will it correspond to an element $f$ such that $f * f_{rev} = g$.

Recovering $f$

Given the candidate ideal $I$ and the product $f * f_{rev}$, apply the algorithm
reviewed below in section 5.1. Note that if $I$ does not correspond to a solution,
this process must fail. One method of dealing with this detail is to process all
possible candidate ideals $F$ in parallel.



5.1 Review of Gentry-Szydlo Algorithm

In this section we review the algorithm of [9], which recovers an element $f$ from
$\mathbb{Z}[X]/(X^p - 1)$ given the ideal it generates and the product $f * f_{rev}$.

A simplified version of the algorithm is as follows. We will express ideals as
$\mathbb{Z}$-modules over the power basis $\{1, X, X^2, \ldots\}$. For a $p$ element vector $f$, the
circulant matrix $Cir(f)$ is the matrix of all rotations of $f$, and the columns of
$Cir(f)$ generate the principal ideal $(f)$ in $\mathbb{Q}(\zeta)$.

First, note that if $F$ denotes $Cir(f)$, then for any other basis $H$ of $(f)$,
$H = FU$ for some unimodular matrix $U$. Next, we combine this information with
$D = Cir(f * f_{rev}) = FF^T$ by forming the product

$$G = H^T D^{-1} H = U^T F^T (F F^T)^{-1} F U = U^T U.$$

This last matrix can be viewed as the Gram matrix of an auxiliary lattice. The
auxiliary lattice might be called a hypercubic lattice [19] since it is a rotation of
the trivial lattice $\mathbb{Z}^n$. Moreover the paper [19] suggests that it may be easier to
reduce hypercubic lattices than general lattices of a similar dimension.

This lattice, expressed via the Gram matrix $G$, may be reduced with LLL,
or one of its variants, producing a unimodular matrix $V$ such that $V^T G V$ is the
Gram matrix of a more reduced basis. If very successful (perhaps because of the easier problem
the hypercubic lattices pose) the lattice reduction might reduce the Gram
matrix right down to the identity matrix (e.g. $V = U^{-1}$ would). In such a case,
$V^T U^T U V = Id$ implies that $W = UV$ is a signed permutation matrix, and
multiplying the matrix $FU$ by $V$ yields $FW$, whose columns are all signed rotations
of the sought vector, $f$.
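
The key identity of the simplified algorithm, that $H^T D^{-1} H$ equals $U^T U$ for any basis $H = FU$, can be checked exactly on a toy instance. The sketch below uses only rational arithmetic from the standard library; the helper names, the example vector $f$ and the matrix $U$ are illustrative assumptions, and no lattice reduction is performed.

```python
from fractions import Fraction

def circulant(v):
    """Matrix whose j-th column is v rotated by j places (multiplication by X^j)."""
    p = len(v)
    return [[v[(i - j) % p] for j in range(p)] for i in range(p)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def inverse(A):
    """Gauss-Jordan inverse over the rationals (A is assumed invertible)."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                factor = M[r][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

def gram_of_auxiliary_lattice(H, D):
    """G = H^T D^{-1} H, the Gram matrix that lattice reduction would work on."""
    return matmul(matmul(transpose(H), inverse(D)), H)

if __name__ == "__main__":
    f = [1, 1, 0, 1, 0]                       # toy secret, p = 5
    F = circulant(f)
    D = matmul(F, transpose(F))               # equals Cir(f * f_rev)
    U = [[1, 2, 0, 0, 0], [0, 1, -1, 0, 0], [0, 0, 1, 3, 0],
         [0, 0, 0, 1, 1], [0, 0, 0, 0, 1]]    # unimodular (determinant 1)
    H = matmul(F, U)                          # another basis of the ideal (f)
    G = gram_of_auxiliary_lattice(H, D)
    print(G == matmul(transpose(U), U))
```

In an actual attack $U$ is unknown: only $H$ and $D$ are available, and the point of the identity is that $G$ can nevertheless be formed and handed to a Gram-matrix variant of LLL.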
Thus, provided that the lattice reduction algorithm manages to fully reduce
$G$ to the identity, $f$ can be recovered. However, the variants of LLL which are
guaranteed to produce a shortest vector are exponential time algorithms. In
practice, LLL and its higher block-size variants [17] often produce much shorter
vectors than the available bounds suggest.

The approach taken in the more complicated algorithm in [9] provides a
strategy that will guarantee that we find the shortest vector in polynomial time,
even though LLL only guarantees a vector within a fixed factor of the shortest one
in polynomial time. The technical trick is to attempt to reduce the ideal $F^{R-1}$, for a prime $R$
congruent to 1 mod $p$, in such a way that an element $f^{R-1}a$ is produced where
$a$ has small $L_2$ norm. The congruence $f^{R-1}a \equiv a \bmod R$ can be used to find $a$
when it is small compared to $R$. A final calculation finds $f$ from $f^{R-1}$.

6 General Case 

The assumptions made in the previous section were convenient but not essential.
We now explain how to extend the algorithm to the general case, where we do
not assume that $p \equiv 3 \bmod 4$ or that the norm does not have repeated prime
factors.

We suppose that the cyclotomic norm, $N_f$, factors as $\prod q_i^{a_i}$ and wish to
determine a list of candidate ideals $F$ for $(f)$.

Fix a prime $q$. Then the factorization of $q$ into prime ideals in $\mathbb{Q}(\zeta_p)$ may
be efficiently computed by factoring the polynomial $1 + X + \cdots + X^{p-1}$
over $\mathbb{F}_q$ using either the Berlekamp or Cantor-Zassenhaus algorithms (see [3],
section 3.4 for more details). (The number of prime ideals above $q$ depends on
the order of $q$ mod $p$.) For each prime ideal $Q$ above $q$, the exact exponent of
$Q$ dividing $(f * \bar{f})$ may be computed via ideal divisibility tests. Thus, given the
factorization of the norm $N_f$, the factorization of $(f * \bar{f})$ into ideals may be
efficiently computed.
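
The dependence on the order of $q$ mod $p$ can be made explicit: for a prime $q \ne p$, the polynomial $1 + X + \cdots + X^{p-1}$ splits over $\mathbb{F}_q$ into $(p-1)/f$ irreducible factors of degree $f$, where $f$ is the multiplicative order of $q$ modulo $p$. The short sketch below (function names are illustrative) computes these two numbers without factoring the polynomial itself.

```python
def multiplicative_order(q, p):
    """Order of q in the multiplicative group modulo the prime p (p must not divide q)."""
    if q % p == 0:
        raise ValueError("q must not be divisible by p")
    order, power = 1, q % p
    while power != 1:
        power = power * q % p
        order += 1
    return order

def primes_above(q, p):
    """Return (number of prime ideals above q in Q(zeta_p), residue degree) for q != p."""
    f = multiplicative_order(q, p)
    return (p - 1) // f, f

if __name__ == "__main__":
    for q in (2, 3, 29):
        print(q, primes_above(q, 7))
```

A prime $q \equiv 1 \pmod p$ therefore splits completely, which is the situation exploited in the special case of Section 5.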

Now consider a prime ideal $Q$, and let $r$ be the largest power of $Q$ dividing
$g$. For each $Q$, let $\bar{Q}$ be the complex conjugate ideal. In the case that $Q = \bar{Q}$,
the power of $Q$ dividing $(f)$ is clearly $r/2$. When $Q \ne \bar{Q}$, then we know that
the maximal power of $Q$, $a_Q$ say, that divides $(f)$ must be in $\{0, \ldots, r\}$, and
moreover $a_Q + a_{\bar{Q}} = r$. This leaves $r + 1$ possible candidates to enumerate over.

Considering all prime ideal pairs $Q_i \ne \bar{Q}_i$ such that $Q_i^{r_i} \mid (f * \bar{f})$, we see that
the total number of candidate ideals for $(f)$ is $\prod (r_i + 1)$.

Once the list of candidate ideals $F$ has been established, the algorithm may
proceed as described above. The only difference is that one does not have the
simplicity and efficiency gain obtained by working in the quadratic imaginary
subfield.



7 Experiments 

The norm, $N_f$, of $f$ down to $\mathbb{Z}$ can be calculated as $1/f(1)$ times the determinant
of an associated cyclic (circulant) matrix, and thus can be bounded by $|f|^p$, where $|\cdot|$
denotes the Euclidean norm of the entries of $f$. In experiments both the number
of factors and the multiplicity of these factors followed what one would expect
from the Prime Number Theorem, which leads us to conjecture that the prime
factorizations of the norms are distributed in a similar fashion to randomly
sampled numbers with a similar bit length.
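
The determinant formula for the norm can be tried out directly. The following sketch computes $N_f = \det Cir(f) / f(1)$ exactly with a fraction-free (Bareiss) elimination; the function names are hypothetical, and $f(1) \ne 0$ is assumed.

```python
def circulant(v):
    """Matrix whose j-th column is v rotated by j places."""
    p = len(v)
    return [[v[(i - j) % p] for j in range(p)] for i in range(p)]

def int_det(A):
    """Exact integer determinant by fraction-free (Bareiss) elimination."""
    M = [row[:] for row in A]
    n = len(M)
    sign, prev = 1, 1
    for k in range(n - 1):
        if M[k][k] == 0:
            swap = next((r for r in range(k + 1, n) if M[r][k] != 0), None)
            if swap is None:
                return 0
            M[k], M[swap] = M[swap], M[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # The division by the previous pivot is always exact.
                M[i][j] = (M[i][j] * M[k][k] - M[i][k] * M[k][j]) // prev
        prev = M[k][k]
    return sign * M[n - 1][n - 1]

def cyclotomic_norm(f):
    """N_f = det(Cir(f)) / f(1), assuming f(1) != 0."""
    f_at_1 = sum(f)
    if f_at_1 == 0:
        raise ValueError("f(1) must be nonzero")
    det = int_det(circulant(f))
    assert det % f_at_1 == 0
    return det // f_at_1

if __name__ == "__main__":
    print(cyclotomic_norm([2, 1, 0, 0, 0]))   # norm of 2 + zeta_5
    print(cyclotomic_norm([1, 1, 0, 1, 0]))   # norm of 1 + zeta_5 + zeta_5^3
```

For $f = 2 + X$ with $p = 5$ the determinant is $2^5 + 1 = 33$ and $f(1) = 3$, so the norm is 11, well inside the bound $|f|^p = 5^{5/2}$.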

In obtaining practical results, we implemented just the p = 3 mod 4 case in 
PARI. 



7.1 Recovering $f$ When $p \equiv 3 \bmod 4$

For small values of $p$ congruent to 3 mod 4, the entire algorithm as described
above was implemented in PARI [11]. This symbolic and computational
calculator has the advantage of being able to apply the LLL algorithm to a lattice
expressed via a Gram matrix, which the (otherwise flexible) NTL [18] LLL
implementation does not allow for.

The arithmetic in $\mathbb{Q}(\sqrt{-p})$ was used to create candidate ideals. Our
experiment also implemented the simplified version of the Gentry-Szydlo algorithm
described above, which was sufficient for the primes tested. That is, the original
LLL algorithm was able to reduce the Gram matrix to the identity in the cases
that yielded a solution.

Results were (quickly) obtained with $p$ varying from 19 to 71, with the
fortunate outcome that for these smallish primes, the simplified Gentry-Szydlo
approach was always sufficient to completely reduce the lattice^1. As expected,
the results reflected the fact that not all candidate ideals must yield a solution
for $f$, e.g. there were many instances where the norm had exactly two prime
factors, and only two of the four candidates produced a valid $f$.



8 Conclusions 

We have shown how the difficulty of factoring $f * \bar{f}$ into elements reduces to the
problem of factoring an integer $N_f$, and the problem of applying the (polynomial
time) Gentry-Szydlo algorithm to a number of candidates. Under the assumption
that the factorizations of the norms $N_f$ are distributed in a similar fashion to
those of average numbers of a similar size, the bottleneck of the algorithm occurs
in the integer factorization stage (since the number of candidates is exponential
in the number of prime divisors of $N_f$ but this is logarithmic in $p$).

In particular this paper shows that $f$ such that $|f|$ is small, where $|\cdot|$ denotes
the Euclidean norm of the entries of $f$, are particularly weak, since the integer
$N_f$ is particularly small in these cases.

We remark that if one is trying to build a cryptosystem on the basis of the
hardness of factoring $f * \bar{f}$, then $p$ should be chosen such that it is highly unlikely
that the norm $N_f$ can be factored in a reasonable time.

^1 This is consistent with the conjecture that hypercubic lattices may indeed be an easier
class of lattice to reduce.
Abstract. It is well-known that the minus class number $h_p^-$ of an
imaginary cyclic quartic field of prime conductor $p$ can grow arbitrarily large,
but until now no one has been able to exhibit an example for which
$h_p^- > p$. In an attempt to find such an example, we have tabulated $h_p^-$ for
all primes p = 5 (mod 8) with p < 10^° and for primes p < 10^^ satisfying 
certain quartic character restrictions. An analysis of these data yields
unconditional numerical evidence in support of the Cohen-Martinet
heuristics, but as we did not find a value of $p$ for which $h_p^- > p$ by these
methods, we constructed a 77-digit value of $p$ for which one can prove
$h_p^- > p$ assuming the Extended Riemann Hypothesis.

1 Introduction 

Let $p$ be a prime and let $\zeta_p$ be a primitive $p$th root of unity. Let $N$ denote the
maximal imaginary subfield of degree $d$ a power of 2 of the cyclotomic field $\mathbb{Q}(\zeta_p)$
and let $N^+$ denote the real subfield of degree $d/2$ of $N$. The minus
(relative) class number $h_N^-$ of $N$ is given by

$$h_N^- = h_N / h_{N^+},$$

where $h_N$ and $h_{N^+}$ are the class numbers of $N$ and $N^+$, respectively. For
example, if $p \equiv -1 \pmod 4$, then $N_p := N = \mathbb{Q}(\sqrt{-p})$, $N^+ = \mathbb{Q}$, and $h_p^- := h_N^-$ is
the class number of $\mathbb{Q}(\sqrt{-p})$. In this case we always have $h_p^- < p$.

If $p \equiv 5 \pmod 8$, there exist integers $a$ and $b$ such that $p = a^2 + b^2$,
$a \equiv -1 \pmod 4$, $b \equiv 2 \pmod 4$, and $ab \equiv 2 \pmod 8$. (These conditions suffice to
determine $a$ and $b$ uniquely.) In this case

$$N_p := N = \mathbb{Q}\left(\sqrt{-(p + b\sqrt{p})}\right), \qquad N_p^+ := N^+ = \mathbb{Q}(\sqrt{p}),$$

and if $h_p^-$ is the minus class number of $N_p$, then $h_p^-$ is not necessarily less than
$p$. In fact, for any $c > 0$ it can be shown that there exists an infinitude of values
of $p$ such that $h_p^- > cp$. In the sequel we will confine our attention to $h_p^-$ when
$p \equiv 5 \pmod 8$.

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 280-292, 2004. 

© Springer- Verlag Berlin Heidelberg 2004 
Let $\chi_p$ denote the only quartic Dirichlet character of conductor $p$ such that
$\chi_p(2) = i$ ($i^2 + 1 = 0$). If, as usual, we denote the Dirichlet $L$-function as

$$L(s, \chi_p) = \sum_{n=1}^{\infty} \frac{\chi_p(n)}{n^s},$$

then

$$h_p^- = \tfrac{1}{2}\,|L(0, \chi_p)|^2. \qquad (1)$$



Louboutin [5] has pointed out that we may write L{0,Xp) as 



7T 






dx 



/nv'TT/p 



(2) 



where

$$W(\chi_p) = \frac{\tau(\chi_p)}{i\sqrt{p}}$$

and $\tau(\chi_p)$ is the Gauss sum

$$\tau(\chi_p) = \sum_{j=1}^{p-1} \chi_p(j)\, e^{2\pi i j/p}.$$

Also, t{xp) can be evaluated as e^ap where 



p + ay/p 

2 



>1 



p- ay/p 
2 



and 6p is either 1 or —1. 

Louboutin used the fact that $L(0, \chi_p) \in \mathbb{Z}[i]$ to develop a method of using only
a certain number of the earliest terms of (2) to approximate $L(0, \chi_p)$ sufficiently
accurately to be able to deduce its exact value. By using this technique he was
able to compute unconditionally all the values of h/ for p < 10^. Furthermore, by 
restricting the value of $p$ such that $\chi_p(q) = 1$ for $q \in \{3, 5, 7, 11, 13, 17, 19, 23, 29\}$,
he was able to use his method to discover that if $p = 1679516029$, then $h_p^- =
904595821 > p/2$ ($h_p^-/p \approx 0.5386$), but he was unable to say whether this was
the least such $p$ for which $h_p^- > p/2$. Later in [6], he showed how his technique
for computing $h_p^-$ could be improved, but did not describe any implementations
of his new ideas.

The purpose of this paper is to describe and use a further modification of
Louboutin’s later procedure to find all the values of $h_p^-$ for $p < 10^{10}$. On
implementing this technique, we were able to establish that the least $p$ such that
$h_p^-/p > 1/2$ is $p = 599630509$. We also discuss our attempt to find values of
$p$ for which we get larger values of the ratio $h_p^-/p$ by prescribing the quartic
character $\chi_p(q)$ for the first several odd primes $q$. By doing this we found that
if $p = 34131117018541$, then $h_p^- = 23754887413613$ and $h_p^-/p \approx 0.6960$, the
largest value of this ratio that we were able to find unconditionally. Finally, as
we had been challenged by Louboutin to find a $p$ for which $h_p^-/p > 1$, we exhibit
such a prime and describe our procedure for constructing it and verifying that
$h_p^-/p > 1$. However, we must emphasize that this result is contingent on the
truth of the extended Riemann hypothesis (ERH).



2 Tabulation of $h_p^-$ for $p < 10^{10}$



As indicated earlier, Louboutin provided an improved version of his earlier
algorithm for evaluating $h_p^-$ in [5]. This method makes use of the following special
case of his more general Theorem 4.



Theorem 1. Let M > 1 be given and let m 
Lm(0,Xp) by 




. If we define 



Lm{0,Xp) = Vp'ZI ^ '^{en + en+i)Sn{Xp) , ( 3 ) 

where 

n 

SniXp) = X! = exp(-n^7r/p) , 



then 



1-^(0, Xp) -^m(0,Xp)I < 




3 

^ ■ 



Since \L (0, Xp)| G Z, we see that if p > 5 and M is sufficiently large that 



3 3^1 

2vOT ^ 2 



(i.e., M > 4.57) , 



then we can compute $h_p^-$ by using only $m$ terms in $L_M(0, \chi_p)$. This,
of course, is predicated on our being able to evaluate $W(\chi_p)$ or, equivalently,
determining $\epsilon_p$. Louboutin (unpublished) developed a method for doing this,
but it is quite slow and unwieldy. Indeed, as noted by Berndt and Evans [2],
there is no known efficient technique for computing $\epsilon_p$. However, as there are
only two possible values for $\epsilon_p$, we simply evaluated $L_M(0, \chi_p)$ for both and
selected the one which is closest to a Gaussian integer, after having checked
that the other one is not close to any Gaussian integer, i.e. that its distance from
$\mathbb{Z}[i]$ is not less than the error bound of Theorem 1. Any possible ambiguity can be
resolved by increasing $M$. For example, if $p = 17333$, then selecting $M$ such that
the error bound of Theorem 1 is less than 0.01 yields an approximation for which the two possible
values of $L_M(0, \chi_p)$, $-26.0021066 - 35.9987971i$ and $32.9999231 + 38.9996724i$,
have distances 0.002425 and 0.0003365 from the nearest Gaussian integer, both of
which are less than 0.01. Using 0.001 for the error bound separates the two cases
successfully, as we get $-26.0021379 - 35.9985175i$ and $33.0000346 + 38.9999419i$,
and only the latter has distance (0.00006760) less than 0.001 from the nearest
Gaussian integer (the former has distance 0.002601). Computationally, this
strategy worked very well: an error bound of 0.01 worked in almost all cases.
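
The selection rule just described can be replayed on the numbers quoted for $p = 17333$. This is a hedged sketch with hypothetical function names: it takes the two candidate values as given and does not evaluate $L_M(0, \chi_p)$ itself.

```python
def distance_to_gaussian_integer(z):
    """Return (distance, nearest Gaussian integer) for a complex number z."""
    nearest = complex(round(z.real), round(z.imag))
    return abs(z - nearest), nearest

def select_candidate(candidates, error_bound):
    """Return the Gaussian integer if exactly one candidate lies within
    error_bound of Z[i]; otherwise return None (increase M and retry)."""
    close = [z for z in candidates
             if distance_to_gaussian_integer(z)[0] < error_bound]
    if len(close) != 1:
        return None
    return distance_to_gaussian_integer(close[0])[1]

if __name__ == "__main__":
    coarse = [complex(-26.0021066, -35.9987971), complex(32.9999231, 38.9996724)]
    fine = [complex(-26.0021379, -35.9985175), complex(33.0000346, 38.9999419)]
    print(select_candidate(coarse, 0.01))    # ambiguous at this error bound
    print(select_candidate(fine, 0.001))     # unambiguous after increasing M
```

With the tighter bound the surviving value is $33 + 39i$, the Gaussian integer taken as the exact value of $L(0, \chi_p)$.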

Louboutin also mentioned that we can lessen the work needed to evaluate
$e_n$ by using the recursive formulas $e_{n+1} = f_n e_n$, $f_{n+1} = h f_n$, where
$h = \exp(-2\pi/p)$, $f_0 = \exp(-\pi/p)$, $e_0 = 1$. Thus, any $e_n$ can be computed
by performing two multiplications. Note that $f_n = \exp(-\pi(2n + 1)/p)$.
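
A minimal sketch of this recursion, in ordinary double precision rather than the extended precision a real tabulation would need (the function name is illustrative):

```python
import math

def gaussian_weights(p, count):
    """Return [e_0, ..., e_{count-1}] with e_n = exp(-n^2 * pi / p),
    using e_{n+1} = f_n * e_n and f_{n+1} = h * f_n."""
    h = math.exp(-2 * math.pi / p)
    f = math.exp(-math.pi / p)      # f_0
    e = 1.0                         # e_0
    weights = []
    for _ in range(count):
        weights.append(e)
        e *= f                      # e_{n+1} = f_n e_n
        f *= h                      # f_{n+1} = h f_n
    return weights

if __name__ == "__main__":
    p = 13
    for n, e_n in enumerate(gaussian_weights(p, 5)):
        print(n, e_n, math.exp(-n * n * math.pi / p))
```

Each new weight costs two multiplications and no further calls to the exponential function; rounding errors accumulate along the recursion, which is one reason to carry extra working precision.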

There remains the problem of evaluating $\chi_p(n)$. It is easy to see that

$$\chi_p(a) = i^{\lambda(a)}, \qquad (4)$$

where

$$\lambda(a) = \min\{\alpha \ge 0 \mid a^{(p-1)/4} \equiv t_p^{\alpha} \pmod p\} \in \{0, 1, 2, 3\}$$

and $t_p$ is defined as $t_p \equiv 2^{(p-1)/4} \pmod p$. However, for large values of $n$
($n > \sqrt{p}$), the determination of all the characters needed in (3) can be very
time-consuming. Thus, we found it convenient to make use of an idea in Stein and
Williams [8]. We put $r_{-2} = n$, $r_{-1} = p$ and execute the Euclidean algorithm to
obtain the continued fraction expansion of $n/p$. On doing this we get

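$$r_{j-2} = q_j r_{j-1} + r_j \quad (0 \le r_j < r_{j-1}) \quad (j = 0, 1, 2, \ldots).$$

Putting $A_{-2} = 0$, $A_{-1} = 1$, $B_{-2} = 1$, $B_{-1} = 0$ and

$$A_{j+1} = q_{j+1} A_j + A_{j-1}, \quad B_{j+1} = q_{j+1} B_j + B_{j-1} \quad (j = -1, 0, 1, 2, \ldots),$$

we get

$$n B_j - p A_j = (-1)^j r_j.$$

If we find that $k$ such that

$$B_k \le \lfloor \sqrt{p} \rfloor < B_{k+1},$$

then

$$r_k = |n B_k - p A_k| < \sqrt{p}.$$

Also, since

$$B_k n \equiv (-1)^k r_k \pmod p,$$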
Table 1. $h_p^-/p$ hichamps

        p     h_p^-   h_p^-/p  |           p       h_p^-   h_p^-/p
       13         1   0.07692  |     1908421      667025   0.34952
      109        17   0.15596  |     2405701      891545   0.37060
      757       125   0.16513  |     3521389     1315845   0.37367
      829       145   0.17491  |     4975909     1860241   0.37385
     1621       333   0.20543  |     5020261     1936701   0.38578
     7669      1585   0.20668  |     6056989     2342125   0.38668
    12301      2825   0.22966  |     6389629     2765413   0.43280
    24469      6029   0.24639  |    14283229     6447985   0.45144
    29989      8325   0.27760  |    30903469    13975937   0.45224
   101581     29753   0.29290  |    89599381    41338865   0.46137
   126949     39593   0.31188  |   125105821    57930237   0.46305
   199021     62145   0.31225  |   169614589    78807397   0.46463
   410029    135533   0.33054  |   182338381    88134569   0.48336
   578029    198725   0.34380  |   599630509   304391965   0.50763
   661621    227869   0.34441  |  1679516029   904595821   0.53861
  1039021    362389   0.34878  |  6033132109  3381836985   0.56054

we can evaluate $\chi_p(n)$ as

    $\chi_p(n) = \chi_p(-1)^k \chi_p(r_k) \chi_p(B_k)^{-1} = (-1)^k \chi_p(r_k) \chi_p(B_k)^{-1}$.

Thus, if we tabulate $\chi_p(x)$ for all $1 \le x \le \lfloor \sqrt{p} \rfloor$, we can compute $\chi_p(n)$ in no more than two table look-ups after evaluating $r_k$ and $B_k$ via a partial Euclidean algorithm. This is faster than using (4) to evaluate $\chi_p(n)$ when $n > \sqrt{p}$, and since $m > \sqrt{p}$, increases the speed with which we can evaluate $L_m(0, \chi_p)$. In our implementation, the modified character evaluation was from 1.4 to 1.9 times faster, depending upon the size of $p$.
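The partial Euclidean step described above can be sketched as follows. This is only an illustration, not the authors' NTL implementation: the function name `partial_euclid` is hypothetical, and the sketch assumes $p \ge 2$ with $\gcd(n, p) = 1$. It returns the index $k$ together with $r_k$ and $B_k$, so that a character value at $n$ could be recovered from table entries at $r_k$ and $B_k$.

```python
from math import isqrt


def partial_euclid(n, p):
    """Return (k, r_k, B_k) with B_k * n = (-1)^k * r_k (mod p),
    B_k <= floor(sqrt(p)) < B_{k+1} and r_k < sqrt(p).

    Assumes p >= 2 and gcd(n, p) = 1 (true for a prime p not dividing n).
    """
    n %= p
    if n == 0:
        raise ValueError("n must not be divisible by p")
    bound = isqrt(p)
    r_prev, r_cur = n, p   # r_{-2}, r_{-1}
    b_prev, b_cur = 1, 0   # B_{-2}, B_{-1}
    k = -1                 # index of the pair (r_cur, b_cur)
    while True:
        q, r_next = divmod(r_prev, r_cur)
        b_next = q * b_cur + b_prev
        if b_next > bound:
            # B_k <= bound < B_{k+1}: stop with the current pair
            return k, r_cur, b_cur
        r_prev, r_cur = r_cur, r_next
        b_prev, b_cur = b_cur, b_next
        k += 1
```

For example, `partial_euclid(5, 13)` stops at `k = 2` with `r_k = 2` and `B_k = 3`, and indeed 3 * 5 = 15 is congruent to 2 modulo 13.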

We used this method, implemented using the NTL C++ number theory library [7], to tabulate values of $h^-$ for the 113764515 primes $p < 10^{10}$, $p \equiv 5 \pmod 8$. The computation took 1564 days of CPU time on 269 2.4 GHz Xeon processors running Linux, i.e., about 6 days of real time or approximately 1.2 seconds per field. We used 104 bits of precision for the floating-point approximations, and found that this sufficed in all cases.

The $h^-/p$ hichamps for these primes are listed in Table 1 and the lochamps in Table 2. Our computation shows that $p = 599630509$ is the smallest prime such that $h^-/p > 1/2$. The largest ratio $h^-/p$ found by this method, 0.56054, is larger than that found by Louboutin [5] (0.5386), but still unfortunately quite far from 1. The slow growth rate of $h^-/p$ leads us to believe that this approach is unlikely to produce such a $p$ without further significant algorithmic advances.



2.1 The Cohen-Martinet Heuristics 

Although our tabulation was unsuccessful in finding $p$ with $h^- > p$, the data are still useful for providing evidence in support of the Cohen-Martinet heuristics
Table 2. $h^-/p$ lochamps

       p    h^-      h^-/p  |           p         h^-      h^-/p
      13      1  0.0769231  |       39829         289  0.0072560
      29      1  0.0344828  |      226669        1429  0.0063043
      37      1  0.0270270  |      986749        5365  0.0054370
      53      1  0.0188679  |    13136509       70669  0.0053796
      61      1  0.0163934  |    27733861      142225  0.0051282
     349      5  0.0143266  |    63449149      313909  0.0049474
     373      5  0.0134048  |   181584901      814333  0.0044846
    1789     13  0.0072666  |  3090045781    13284769  0.0042992


[3]. Among other things, these heuristics provide predictions on the probability that an odd integer $l$ divides $h^-$. For an integer $p \ge 2$ and $a$ an integer or $\infty$, set $(p)_a = \prod_{1 \le k \le a} (1 - p^{-k})$, $(p)_0 = 1$. Then the probability that an odd integer $l$ divides $h^-$ is estimated to be $P_1(l) P_3(l)$, where



Pi{l) 

Psil) 



\\l \ 0 </ 3 + 7 <ck 

p=i (mod 4) 

n (1 - (p^)oo/(p^)[(a-i)/2]) 

II I 

p=3 (mod 4) 



-1 



($[x]$ denotes the nearest integer to $x$). For example, prob($3 \mid h^-$) $\approx 0.123440$, prob($5 \mid h^-$) $\approx 0.421894$, and prob($7 \mid h^-$) $\approx 0.020825$.

If we assume that the subset of imaginary cyclic quartic fields of prime conductor behaves the same as all such fields, then our data are especially well-suited for this purpose, as the minus class numbers computed with Louboutin's method are unconditionally correct. A similar assumption, restricting to prime discriminants, has proved to be reasonable in other contexts, in particular real quadratic fields [9].

Let $r_l(x)$ denote the fraction of $p < x$ for which $l \mid h^-$, i.e.,

    $r_l(x) = \dfrac{|\{p < x : l \mid h_p^-,\ p \text{ prime},\ p \equiv 5 \pmod 8\}|}{|\{p < x : p \text{ prime},\ p \equiv 5 \pmod 8\}|}$

and let $q_l(x)$ be given by

    $q_l(x) = \dfrac{r_l(x)}{P_1(l) P_3(l)}$,

the ratio between $r_l(x)$ and the predicted probability that $l \mid h^-$. If the Cohen-Martinet heuristics are accurate, we would expect $q_l(x)$ to approach 1 as $x$ increases.
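As a rough illustration of how $r_l(x)$ and $q_l(x)$ are formed from tabulated data, consider the sketch below. The function names are hypothetical, and the predicted probability $P_1(l) P_3(l)$ is simply passed in as a number. The sample pairs $(p, h^-)$ are copied from Tables 1 and 2 only to exercise the code; being record holders, they are not a representative sample and say nothing about the heuristics.

```python
from fractions import Fraction


def divisibility_fraction(records, l, x):
    """r_l(x): fraction of the tabulated primes p < x whose h^- is divisible by l.

    `records` is an iterable of (p, h_minus) pairs.
    """
    in_range = [(p, h) for p, h in records if p < x]
    if not in_range:
        raise ValueError("no tabulated primes below x")
    hits = sum(1 for _, h in in_range if h % l == 0)
    return Fraction(hits, len(in_range))


def heuristic_ratio(records, l, x, predicted):
    """q_l(x) = r_l(x) / predicted, where `predicted` stands for P_1(l) * P_3(l)."""
    return float(divisibility_fraction(records, l, x)) / predicted


# Illustrative input only: a few (p, h^-) pairs taken from Tables 1 and 2.
SAMPLE = [(13, 1), (29, 1), (37, 1), (53, 1), (61, 1),
          (109, 17), (349, 5), (373, 5), (757, 125)]
```

With this toy input, `divisibility_fraction(SAMPLE, 5, 1000)` is 3/9, since only 349, 373 and 757 have $h^-$ divisible by 5.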
Table 3. $q_l(x)$ values

                                              l
              x        3        5        7        9       11       13       15       27
       10000000  1.00239  0.99408  0.99245  1.00239  0.99066  1.00492  0.99462  0.99038
       20000000  0.99904  0.99466  0.98476  0.99904  1.01804  0.99902  0.98544  0.97838
       30000000  0.99994  0.99614  0.98937  0.99994  1.00453  0.99843  0.99157  0.98537
       40000000  1.00196  0.99538  0.99411  1.00196  1.00864  0.99963  0.99403  0.99860
       50000000  1.00214  0.99643  1.00067  1.00214  1.01580  0.99970  0.99510  0.99172
       60000000  1.00346  0.99650  1.00024  1.00346  1.01078  1.00002  0.99721  0.99857
       70000000  1.00516  0.99719  1.00089  1.00516  1.00961  1.00021  1.00026  0.99993
       80000000  1.00360  0.99682  1.00028  1.00360  1.00739  0.99948  0.99881  0.99829
       90000000  1.00317  0.99687  1.00309  1.00317  1.00497  0.99915  0.99832  1.00153
      100000000  1.00231  0.99649  1.00085  1.00231  1.00120  0.99872  0.99673  0.99742
      200000000  1.00167  0.99785  0.99999  1.00167  1.00099  0.99876  0.99979  1.00068
      300000000  1.00058  0.99832  0.99866  1.00058  0.99966  0.99923  0.99896  1.00051
      400000000  0.99919  0.99894  0.99847  0.99919  1.00174  1.00020  0.99884  0.99754
      500000000  0.99903  0.99892  0.99901  0.99903  0.99808  1.00083  0.99884  0.99611
      600000000  0.99905  0.99900  0.99877  0.99905  0.99782  1.00067  0.99832  0.99512
      700000000  0.99883  0.99905  0.99880  0.99883  0.99840  1.00060  0.99841  0.99430
      800000000  0.99870  0.99918  0.99988  0.99870  0.99931  1.00031  0.99802  0.99578
      900000000  0.99912  0.99913  0.99990  0.99912  0.99758  1.00018  0.99871  0.99515
     1000000000  0.99884  0.99919  0.99970  0.99884  0.99769  1.00024  0.99832  0.99642
     2000000000  0.99868  0.99962  1.00136  0.99868  1.00041  1.00035  0.99799  0.99782
     3000000000  0.99929  0.99966  1.00136  0.99929  1.00010  1.00036  0.99925  0.99810
     4000000000  0.99969  0.99968  1.00094  0.99969  0.99968  1.00031  0.99960  0.99906
     5000000000  0.99942  0.99966  0.99985  0.99942  0.99777  1.00022  0.99902  0.99834
     6000000000  0.99962  0.99969  0.99987  0.99962  0.99782  1.00037  0.99933  0.99871
     7000000000  0.99964  0.99966  0.99974  0.99964  0.99767  1.00034  0.99949  0.99873
     8000000000  0.99973  0.99969  0.99991  0.99973  0.99772  1.00025  0.99954  0.99880
     9000000000  0.99976  0.99971  1.00006  0.99976  0.99743  1.00004  0.99964  0.99897
    10000000000  0.99980  0.99983  1.00015  0.99980  0.99768  1.00004  0.99983  0.99932


In Table 3 we give values of $q_l(x)$ for several small values of $l$, and in Figure 1 and Figure 2 we plot the $q_3(x)$ and $q_5(x)$ values, respectively. As these values do appear to approach 1 (this is especially clear in the graphs), our data strongly suggest that the Cohen-Martinet heuristics are indeed accurate in this particular case.

3 Further Tabulation 

As it was our intention to find some $p$ for which $h^- > p$, the results of the computer run described in the previous section were somewhat disappointing. We therefore decided to modify and extend Louboutin's earlier idea of prescribing the quartic character of the primes for which we could evaluate $h^-$. Instead of using only the first 9 primes, we prescribed $p$ as follows:

• $\chi_p(q) = 1$ for $3 \le q \le 31$ (the first 10 odd primes),

this maximizes the first several terms of the Euler product for $|L(1, \chi_p)|$.

In order to filter out possible prime candidates quickly by recognizing whether or not this characterization holds, we made use of a result of Lehmer [4]. If $p = a^2 + b^2$ as mentioned in Section 1 and $(q/p) = 1$ (clearly if $\chi_p(q) = 1$ then $(q/p) = 1$), then

• $\chi_p(q) = (-1/q)$ if $q \mid b$

• $\chi_p(q) = (-2/q)$ if $q \mid a$

• $\chi_p(q) = (-2\lambda(\lambda + 1)/q)$ if $q \nmid ab$ and $a \equiv \mu b \pmod q$ for any $\mu$ such that $\mu^2 \equiv (\lambda^2 - 1)^{-1} \pmod q$ ($\lambda \not\equiv 0, \pm 1 \pmod q$).

She also showed that there are exactly $(q - 4 - 3(-1/q) - 2(-2/q))/4$ such values of $\mu$ mod $q$.
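The three cases of Lehmer's criterion can be compared against a direct computation of the quartic character. The sketch below is an illustration only: the helper names are hypothetical, $p$ is assumed to be a prime with $p \equiv 5 \pmod 8$ written as $a^2 + b^2$ with $a$ odd, and the criterion is used in the form stated above. When $(q/p) = 1$ the value $\chi_p(q)$ is $\pm 1$ and can be read off from $q^{(p-1)/4} \bmod p$.

```python
from math import isqrt


def legendre(a, q):
    """Legendre symbol (a/q) for an odd prime q."""
    a %= q
    if a == 0:
        return 0
    return 1 if pow(a, (q - 1) // 2, q) == 1 else -1


def two_squares(p):
    """Write a prime p = 1 (mod 4) as a^2 + b^2 with a odd and b even."""
    for a in range(1, isqrt(p) + 1, 2):
        b = isqrt(p - a * a)
        if a * a + b * b == p:
            return a, b
    raise ValueError("p is not a sum of two squares")


def lehmer_prediction(a, b, q):
    """Predicted chi_p(q) in {+1, -1} for p = a^2 + b^2, assuming (q/p) = 1."""
    if b % q == 0:
        return legendre(-1, q)
    if a % q == 0:
        return legendre(-2, q)
    mu = a * pow(b, -1, q) % q
    for lam in range(2, q - 1):          # lambda != 0, +-1 (mod q)
        if mu * mu * (lam * lam - 1) % q == 1:
            return legendre(-2 * lam * (lam + 1), q)
    raise ValueError("no lambda found; is (q/p) = 1?")


def quartic_value(p, q):
    """chi_p(q) computed directly when (q/p) = 1: q^((p-1)/4) mod p is +-1."""
    return 1 if pow(q, (p - 1) // 4, p) == 1 else -1
```

For instance, $p = 53 = 7^2 + 2^2$ and $q = 11$ give $\mu \equiv 9 \pmod{11}$, and both `lehmer_prediction(7, 2, 11)` and `quartic_value(53, 11)` return -1.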

For our run we

1. precomputed a table of all possible values of $\mu$ for the first 20 primes;

2. precomputed a table of inverses modulo $q$ for each of the first 20 primes.

For $p = a^2 + b^2$ we tested whether $a b^{-1} \equiv \mu \pmod q$ for some $\mu$ for each $q$. This required two table look-ups and one multiplication modulo $q$ for each of the 20 values of $q$.
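A table-driven filter of this kind might look like the following sketch. It is not the authors' code: the names `allowed_ratios`, `build_tables` and `passes_filter` are hypothetical, and the tables are built directly from the criterion quoted above rather than from any optimized precomputation.

```python
def legendre(a, q):
    """Legendre symbol (a/q) for an odd prime q."""
    a %= q
    if a == 0:
        return 0
    return 1 if pow(a, (q - 1) // 2, q) == 1 else -1


def allowed_ratios(q):
    """Residues mu mod q for which a = mu*b (mod q) forces chi_p(q) = 1."""
    good = set()
    for lam in range(2, q - 1):                     # lambda != 0, +-1 (mod q)
        if legendre(-2 * lam * (lam + 1), q) != 1:
            continue
        target = pow(lam * lam - 1, -1, q)          # mu^2 = (lambda^2 - 1)^(-1)
        good.update(mu for mu in range(1, q) if mu * mu % q == target)
    return good


def build_tables(primes):
    """Precompute, for each odd prime q, the allowed mu and a table of inverses."""
    return {q: (allowed_ratios(q), [0] + [pow(x, -1, q) for x in range(1, q)])
            for q in primes}


def passes_filter(a, b, tables):
    """True if p = a^2 + b^2 (a odd, p = 5 mod 8) has chi_p(q) = 1 for every q."""
    for q, (allowed, inverse) in tables.items():
        if b % q == 0:
            ok = legendre(-1, q) == 1
        elif a % q == 0:
            ok = legendre(-2, q) == 1
        else:
            ok = (a * inverse[b % q]) % q in allowed   # look-up, multiply, look-up
        if not ok:
            return False
    return True
```

The size of `allowed_ratios(q)` can be compared with the count $(q - 4 - 3(-1/q) - 2(-2/q))/4$ quoted above; for $q = 7, 11, 13, 17$ both give 2.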

Once a value of $p$ had survived this process, we computed $h^-$ by the techniques of Section 2. This process was used on all primes less than $10^{14}$. Out of the 801235712049 primes $p \equiv 5 \pmod 8$ less than $10^{14}$, 59814 had the required quartic character. This computation took 3172 days of CPU time on 269 2.4 GHz Xeon processors running Linux, i.e., about 12 days of real time.

The $h^-/p$ hichamps from these primes are listed in Table 4. The largest $h^-/p$ found by this method is considerably larger than that of the previous section, but still quite far from 1. Given the slow growth rate of $h^-/p$, extending the computation to $10^{15}$, which would require about 4 months with our resources, seems unlikely to produce a $p$ with $h^-/p > 1$.

Another possible strategy would be to strengthen the quartic residuacity requirements for the search in order to reduce the number of candidates for which $h^-$ has to be computed. Indeed, searching for $p$ values with more restricted quartic character would likely yield better results, as our best $p = 34131117018541$ has $\chi_p(q) = 1$ for $q \le 67$, the first 18 odd primes. Unfortunately, such an approach is unlikely to result in a significantly faster search. The strategy we employed filtered out all but 59814 primes (indeed, our filtering parameters were selected so that only a small number of primes passed), and each $h^-$ can be evaluated in under 2 minutes. Thus, the vast majority of computing time was spent on the filtering process, and a further reduction in the number of primes $p$ considered will not greatly impact the speed of the search.

We also attempted to use CASSIE, our most recently completed number sieve, to find values for $\mu$ modulo the product $P$ of all odd primes up to 43 to 73 and test prime values of $9 + b^2$ and $81 + b^2$, where $b \equiv \mu^{-1} a \pmod P$, $b \equiv 2 \pmod 4$. However, the results were not as good as those found by our previous search.
Table 4. $h^-/p$ hichamps for prescribed $p < 10^{14}$

                 p         a         b             h^-    h^-/p
         926179549     29643     -6890       441494505  0.47668
        6033132109     68547    -36530      3381836985  0.56054
       17513700229    128223     32750     10341416269  0.59048
      178488307141    269679   -325210    106156720801  0.59475
      611097213301     78951   -777730    366037283093  0.59898
      977833136149    965943    211630    595915703469  0.60942
     1636256708629    779223   1014430   1025566748785  0.62678
     2749263025909   -212553  -1644410   1723720314185  0.62698
     4233467000701   1173051   1690390   2710099480069  0.64016
     6514175296381   2512059   -451370   4236111745733  0.65029
     6725488386061   2584131   -218530   4401865345149  0.65450
    18989849070949   4332207    470990  12549857707309  0.66087
    32956902209221   2230239  -5289890  21797169170461  0.66138
    34131117018541  -4811229  -3314090  23754887413613  0.69599


4 Constructing a Suitable p 

As our search to this point did not reveal any $p$ such that $h^- > p$, we attempted to construct such a value of $p$ by using a process similar to that employed in a different context by Teske and Williams [10]. We first computed an approximation to $|L(1, \chi_p)|^2$ by computing




(We assume Xp(q) = 1 for all primes q < Q.) We found that if Q = 257, then this 
quantity exceeds thus, since the tail of the modulus of the Euler product is 
likely to be near 1, by (1) it is reasonable to hope that h~ > p for p such that 
$\chi_p(q) = 1$ for all odd primes $q \le 257$.

To find such values of $p$ we first computed

1. $Q_1$, the product of all the primes in the set

    $\mathcal{Q}_1 = \{5, 37, 53, 61, 101, 109, 113, 137, 149, 157, 173, 181, 193, 197, 229, 233, 241\}$.

Note that each of the primes in $\mathcal{Q}_1$ is congruent to 1 mod 4.

2. $Q_2$, the product of all the primes in the set

    $\mathcal{Q}_2 = \{3, 107, 131, 139, 163, 179, 211, 227, 251\}$.

Each of the primes in $\mathcal{Q}_2$ is congruent to 3 mod 8.
Now if $p$ is a prime and $p = Q_2^2 X^2 + 4 Q_1^2$, then by Lehmer's results in Section 3 we have $\chi_p(q) = 1$ for $q \mid Q_1 Q_2$. For the remaining primes $q < 251$ in the set

    $\mathcal{Q}_3 = \{7, 11, 13, 17, 19, 23, 29, 31, 41, 43, 47, 59, 67, 71, 73, 79, 83, 89, 97, 103, 127, 151, 167, 191, 199, 223, 239\}$,

we found $X$ such that $X \equiv 1 \pmod 2$ and $X \equiv 2 Q_1 Q_2^{-1} \mu \pmod q$, where $\mu^2 \equiv (\lambda^2 - 1)^{-1} \pmod q$ and $(-2\lambda(\lambda + 1)/q) = 1$. Again, we used CASSIE to produce suitable values of $X$ modulo $2 Q_3$ where $Q_3$ is the product of all primes in $\mathcal{Q}_3$. When a prime value of $Q_2^2 X^2 + 4 Q_1^2$ is produced, we must have $\chi_p(q) = 1$ for all odd primes $q \le 251$.

By selecting $Q_1$, $Q_2$, and $Q_3$ in this manner, $Q_2^2 X^2$ and $4 Q_1^2$ will be approximately the same size, and the values of $p$ produced will be almost as small as possible using this method. Although we're only forcing $\chi_p(q) = 1$ for odd primes $q \le 251$ as opposed to 257, we expect that some solutions $p$ will still have $\chi_p(q) = 1$ for a few values of $q > 251$ as well, and will thus have $h^- > p$.

The difficulty with this approach is that the resulting values for $p$ are much too large for $h^-$ to be evaluated by our techniques in Section 2. However, we can estimate $h^-/p$ by using Bach's [1] technique. We made use of the slightly modified version of this process described in te Riele and Williams [9].

If $S(x) = \sum_{j=0}^{x-1} (x + j)\log(x + j)$, $B(x, \chi_p) = \prod_{q < x} q/(q - \chi_p(q))$, $a_j = (x + j)\log(x + j)/S(x)$, then under the ERH

    $\left| \log L(1, \chi_p) - \sum_{j=0}^{T-1} a_j \log B(T + j, \chi_p) \right| < A(T, p)$,

where

    $A(T, p) = c(p) G(T) + H(T)$,

$c(p) = 2/3(\log p + 5/3)$ and $G(x)$, $H(x)$ are defined in [9].
If we put

    $S(T, p) = \sum_{j=0}^{T-1} a_j \,\mathrm{Re}(\log B(T + j, \chi_p)) = \sum_{q < 2T-1} w(q) \log \left| \dfrac{q}{q - \chi_p(q)} \right|$,

where

    $w(q) = 1$ for $q < T$,
    $w(q) = \sum_{j=q-T+1}^{T-1} a_j$ for $T \le q < 2T - 1$,

then

    $\left| \log |L(1, \chi_p)| - S(T, p) \right| < A(T, p)$.

Hence

    $\log |L(1, \chi_p)| > S(T, p) - A(T, p)$.    (5)
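The weighted sum $S(T, p)$ can be sketched for small parameters as follows. This is an illustration under stated assumptions, not the computation reported in the paper: the function names are hypothetical, the character is realized by fixing one square root $t$ of $-1$ modulo $p$ (the other choice gives the conjugate character and the same absolute values), and the error term $A(T, p)$ is not computed because $G$ and $H$ are only defined in [9]. The second function evaluates the same quantity directly from the averaged partial products, which gives a consistency check on the weights $w(q)$.

```python
import math


def primes_below(n):
    """All primes q < n, by a simple sieve."""
    sieve = [True] * max(n, 2)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i in range(n) if sieve[i]]


def quartic_character(p):
    """A quartic character mod a prime p = 1 (mod 4), with values in {0, 1, i, -1, -i}."""
    e = (p - 1) // 4
    g = next(g for g in range(2, p) if pow(g, (p - 1) // 2, p) == p - 1)
    t = pow(g, e, p)                       # t^2 = -1 (mod p)
    table = {1: 1, t: 1j, p - 1: -1, p - t: -1j}
    return lambda n: 0 if n % p == 0 else table[pow(n, e, p)]


def averaging_weights(T):
    """The weights a_j = (T+j) log(T+j) / S(T) for j = 0, ..., T-1 (requires T >= 2)."""
    terms = [(T + j) * math.log(T + j) for j in range(T)]
    total = sum(terms)
    return [t / total for t in terms]


def bach_sum_weighted(T, chi):
    """S(T, p) as a single sum over primes q < 2T - 1 with weights w(q)."""
    a = averaging_weights(T)
    tail = [0.0] * (T + 1)                 # tail[k] = a_k + ... + a_{T-1}
    for k in range(T - 1, -1, -1):
        tail[k] = tail[k + 1] + a[k]
    s = 0.0
    for q in primes_below(2 * T - 1):
        w = 1.0 if q < T else tail[q - T + 1]
        s += w * math.log(abs(q / (q - chi(q))))
    return s


def bach_sum_direct(T, chi):
    """The same quantity as sum_j a_j * log|B(T + j, chi)|."""
    a = averaging_weights(T)
    return sum(a[j] * sum(math.log(abs(q / (q - chi(q))))
                          for q in primes_below(T + j))
               for j in range(T))
```

For a small case such as `chi = quartic_character(13)` and `T = 30`, the two functions agree up to floating-point rounding.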

To search for a suitable $p$, we used CASSIE in conjunction with the Linux cluster described in Section 2 to produce prime values of $p = Q_2^2 X^2 + 4 Q_1^2$ as described above. After about 33 hours of computing time, 174 values of $X$ satisfying all the congruence conditions for odd primes $q \le 239$ were produced, of which 4 yielded prime values of $p = Q_2^2 X^2 + 4 Q_1^2$. We could not prove that $h^- > p$ for the first three primes, but the fourth prime, the 77-digit

    p = 167766914685735327386705473398333368155787798062989167405312548
        75872803041429,

obtained using $X = 63174133262220373797$, has $\chi_p(q) = 1$ for all odd primes $q \le 251$ and did yield the desired result. Using Bach's method with $T = 11030000$, we get

    $S(T, p) > 1.50976730, \qquad A(T, p) < 0.01499840$.

This computation took 2 minutes, 13 seconds carrying 53 bits of precision. In order to ensure that our $S(T, p)$ values have sufficient numerical accuracy, we repeated the computation with 300 bits of precision (about 1 hour). These results agreed with the first computation to 12 decimal digits. Thus

    $\log |L(1, \chi_p)| > S(T, p) - A(T, p)$
                        $> 1.49476890$
                        $> \log(\sqrt{2}\,\pi) \approx 1.491303476$,

which together with (1) and (5) implies that $h^- > p$ under the ERH.
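The construction and the final inequality can be re-traced with a few lines of integer and floating-point arithmetic. The sketch below makes two assumptions that are reconstructions rather than statements recovered verbatim: that the constructed prime has the form $p = Q_2^2 X^2 + 4 Q_1^2$, and that the threshold in the last display is $\log(\sqrt{2}\,\pi)$, whose numerical value matches the printed 1.491303476. The variable names are hypothetical; no primality test and no class number computation is attempted.

```python
from math import log, pi, prod, sqrt

# Prime sets and the value of X as printed in Section 4.
Q1_PRIMES = [5, 37, 53, 61, 101, 109, 113, 137, 149, 157, 173, 181, 193,
             197, 229, 233, 241]
Q2_PRIMES = [3, 107, 131, 139, 163, 179, 211, 227, 251]
X = 63174133262220373797

# The 77-digit prime as printed (the two pieces of the line-broken number).
P_PRINTED = int("167766914685735327386705473398333368155787798062989167405312548"
                "75872803041429")

Q1 = prod(Q1_PRIMES)
Q2 = prod(Q2_PRIMES)

# Assumed form of the construction: a = Q2 * X (odd), b = 2 * Q1.
p = (Q2 * X) ** 2 + 4 * Q1 ** 2
matches_printed = (p == P_PRINTED)   # whether the recomputed value equals the printed one

# Lower bound for log|L(1, chi_p)| from the printed bounds on S(T, p) and A(T, p).
lower_bound = 1.50976730 - 0.01499840
threshold = log(sqrt(2) * pi)        # assumed reading of the garbled bound

print(len(str(p)), p % 8, matches_printed)
print(lower_bound, threshold, lower_bound > threshold)
```

Since $a = Q_2 X$ is odd and $b = 2 Q_1$ satisfies $b \equiv 2 \pmod 4$, the value $p$ built this way is automatically congruent to 5 modulo 8.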

5 Conclusions 

Finding a smaller such value of $p$ for which $h^- > p$, ideally one for which we can prove $h^- > p$ unconditionally, remains an open problem. Although tabulating values of $h^-$ for $p > 10^{10}$ can be done, the growth rate of $h^-/p$ is so slow that it appears to be infeasible to find a suitable $p$ with this method unless a significantly faster method for evaluating $h^-$ is developed. Tabulating $h^-$ for primes with restricted quartic character and $p > 10^{14}$ also seems unlikely to yield a suitable $p$ value for reasons outlined above. Constructing special values of $p$ does work, but we do not know how to modify this method to find sufficiently small values of $p$ for which Louboutin's algorithm is computationally feasible. One possibility would be to sieve for values of $p$ directly rather than via the special construction we have employed, but it is unknown how to sieve for pseudoquartics in this manner efficiently. Thus, without some new results on either evaluating $h^-$ or rapidly generating pseudoquartics, finding the smallest $p$ with $h^- > p$, or even any $p$ for which $h^- > p$ can be proved unconditionally, is currently beyond our capabilities.
Abstract. We compute all nonic extensions of $\mathbf{Q}_3$ and find that there are 795 of them up to isomorphism. We describe how to compute the associated Galois group of such a field, and also the slopes measuring wild ramification. We present summarizing tables and a sample application to number fields.



1 Introduction 

This paper is one of three accompanying our online database of low degree $p$-adic fields, located at http://math.la.asu.edu/~jj/localfields/. The first paper, [10], describes the database in general. There are two cases enormously more complicated than all the others in the range considered, octic 2-adic fields and nonic 3-adic fields. The paper [9] describes the 1823 octic 2-adic fields and this paper describes the 795 nonic 3-adic fields.

Our online database has an interactive feature which allows one to enter an irreducible polynomial $f(x) \in \mathbf{Z}[x]$ and obtain a thorough analysis of the ramification in the corresponding number field $K = \mathbf{Q}[x]/f(x)$. The inclusion of octic 2-adic fields and nonic 3-adic fields in the database greatly extends the number fields $K$ that can be analyzed mechanically by our programs. Certainly, the degree of $K$ can be very much larger than 9.

Section 2 discusses a standard resolvent construction and then three more specialized resolvent constructions for nonic fields with a cubic subfield. Section 3 centers on the Galois theory of nonic fields over general ground fields, describing the 34 possibilities for the Galois group associated to a nonic field. Also this section gives further information useful for our particular ground field $\mathbf{Q}_3$. For example, 11 of the 34 possible Galois groups can be immediately ruled out over $\mathbf{Q}_3$, because they don't have appropriate filtration subgroups. Somewhat coincidentally, the 23 groups that remain are exactly those with a normal Sylow 3-subgroup.

The nonic 3-adic field section of our database would run to some twenty printed pages, so here we give only summarizing tables. Section 4 centers on Table 4.1 which sorts the 795 fields we find according to discriminant and Galois group. Of the 23 eligible groups, 22 actually appear. Section 5 describes ramification in nonic 3-adic fields in terms of slopes, with Tables 5.1 and 5.2 summarizing our results. Finally, Section 6 gives an application to number fields.

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 293-308, 2004.



2 Resolvent Polynomials 

Resolvents play a major role in the computation of Galois groups over $\mathbf{Q}$. Some resolvents can be computed quickly using exact arithmetic with resultants. However, more often one computes resolvents via approximations to complex roots, knowing a priori that the resolvents in question have integer coefficients.

The fields studied here are represented by monic polynomials $f \in \mathbf{Z}[x]$. The computation of absolute resolvents can then follow standard methods. However, most applications of resolvents to computing Galois groups in high degree utilize relative resolvents. In the relative case, one somehow gives structure to the roots of $f$ to reflect the fact that $\mathrm{Gal}(f)$ is known to lie in some proper subgroup $G^u$ of $S_n$. One speaks of resolvents relative to the upper bound group $G^u$.

A complication for us is that for a given nonic polynomial $f \in \mathbf{Z}[x]$, we may have $\mathrm{Gal}_{\mathbf{Q}_p}(f) \le G^u$ but $\mathrm{Gal}_{\mathbf{Q}}(f) \not\le G^u$. In this case, the resolvent will have coefficients which are $p$-adic integers, but generally not rational integers, and so the method of computing via complex approximations does not work directly. The method [6] of using $p$-adic approximations rather than complex approximations does not help here. It involves choosing a prime $p$ unramified for the given extension of $\mathbf{Q}$, whereas we are starting with $p$-adic extensions which are highly ramified.

For our three relative resolvents, the upper bound group $G^u$ is the wreath product $S_3 \wr S_3$, which is the generic Galois group of nonic fields with a cubic subfield. Given $f \in \mathbf{Z}[x]$ which defines a nonic extension of $\mathbf{Q}_3$ with a unique cubic subfield, we work around the problem described in the preceding paragraph by computing $\tilde{f} \in \mathbf{Z}[x]$ which defines the same nonic extension of $\mathbf{Q}_3$, but where $\tilde{f}$ has a corresponding cubic subfield over $\mathbf{Q}$. Then we use complex roots in the resolvent construction applied to $\tilde{f}$. The computation of $\tilde{f}$ from $f$ is described further in the last paragraph of this section.

We now describe the four resolvent constructions which we will use systematically in the sequel. The first is a standard absolute resolvent. One starts with a degree $n$ polynomial $f(x)$ with complex roots $\alpha_1, \ldots, \alpha_n$. The resolvent corresponds to the subgroup $S_2 \times S_{n-2}$ of $S_n$ and is given by

    $f_{\mathrm{disc}}(x) = \prod_{i<j} (x - (\alpha_i - \alpha_j)^2) \in \mathbf{Z}[x]$.    (1)

It can be computed quickly without approximations to roots via the formula

    $f_{\mathrm{disc}}(x^2) = \mathrm{Resultant}_y(f(y), f(x + y))/x^n$.    (2)

In one case, we will also make use of the variant $f_{\mathrm{Disc}}(x) = f_{\mathrm{disc}}(x^2)$, which is itself an absolute resolvent for $S_1 \times S_1 \times S_{n-2} \le S_n$. In general, we will systematically denote polynomial resolvent constructions by $f \mapsto f_*$ for some symbol $*$. We will use similar notation for the same resolvent constructions on the level of fields starting in the next section.
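As a small illustration of computing a resolvent from approximate complex roots and then rounding to integer coefficients, the sketch below evaluates formula (1) numerically. It is not the authors' implementation: the function names are hypothetical, the roots are found with a plain Durand-Kerner iteration (a swapped-in textbook method), and the final rounding is only justified because the coefficients are known in advance to be integers and the examples are small.

```python
def poly_roots(coeffs, iterations=500):
    """Approximate complex roots of a monic polynomial with distinct roots.

    `coeffs` lists the coefficients from the leading one down.
    Uses the Durand-Kerner simultaneous iteration.
    """
    n = len(coeffs) - 1

    def f(z):
        acc = 0
        for c in coeffs:
            acc = acc * z + c
        return acc

    roots = [(0.4 + 0.9j) ** k for k in range(n)]
    for _ in range(iterations):
        updated = []
        for i, z in enumerate(roots):
            denom = 1
            for j, w in enumerate(roots):
                if j != i:
                    denom *= z - w
            updated.append(z - f(z) / denom)
        roots = updated
    return roots


def disc_resolvent(coeffs):
    """Integer coefficients of prod_{i<j} (x - (a_i - a_j)^2), leading one first."""
    roots = poly_roots(coeffs)
    result = [1 + 0j]
    for i in range(len(roots)):
        for j in range(i + 1, len(roots)):
            r = (roots[i] - roots[j]) ** 2
            # multiply the current polynomial by (x - r)
            result = [a - r * b for a, b in zip(result + [0], [0] + result)]
    return [round(c.real) for c in result]
```

For $f = x^3 - 2$ this gives $x^3 + 108$, whose constant term is the negative of the discriminant $-108$; for $f = x^3 - x$, with roots $0, \pm 1$, it gives $(x - 1)^2 (x - 4) = x^3 - 6x^2 + 9x - 4$.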

For the remaining three resolvents, we start with an irreducible monic degree nine polynomial $f \in \mathbf{Z}[x]$ such that $\mathbf{Q}[x]/f(x)$ has a unique cubic subfield. We choose $g(y) \in \mathbf{Z}[y]$ so that $\mathbf{Q}[y]/g(y)$ is isomorphic to this subfield. Let

    $h(y, x) = x^3 + \sum_{k=0}^{2} \sum_{\ell=0}^{2} c_{k\ell}\, y^k x^\ell$    (3)

be a cubic factor of $f(x)$ over $\mathbf{Q}[y]/g(y)$. Let $\beta_1, \beta_2, \beta_3$ be the complex roots of $g(y)$. For $i = 1, 2, 3$, let $\alpha_{i,1}$, $\alpha_{i,2}$ and $\alpha_{i,3}$ be the complex roots of $h(\beta_i, x)$. Then we can recover

    $f(x) = \prod_{i=1}^{3} \prod_{j=1}^{3} (x - \alpha_{i,j})$.    (4)

A formula bypassing the $\alpha_{i,j}$ is

    $f(x) = \mathrm{Resultant}_y(g(y), h(y, x))$.    (5)

The three resolvents are



i=io-eS3 ^ 

3 3 3 

f27(x ) = nn n (x — {ai^i + a2j + , 

f 36 ix)= n n 

o-eSa tGSs \ i—1 / 



fisix ) = n n 




( 4 ) 

( 5 ) 

(6) 

( 7 ) 

(8) 



Now we return to describing how we adjust a given polynomial $f$ to a better one $\tilde{f}$. We first find a global model $\mathbf{Q}[y]/g(y)$ for the cubic subfield of the extension $\mathbf{Q}_3[x]/f(x)$ of $\mathbf{Q}_3$; this is easy using the database [10]. Then, we use Algorithm 3.6.4 from [2] to reduce the factorization of $f(x)$ over $\mathbf{Q}_3[y]/g(y)$ to a factorization problem over $\mathbf{Q}_3$. We loosely approximate a cubic factor $h(y, x) \in \mathbf{Z}_p[x]$ that we obtain by a polynomial $\tilde{h}(y, x) \in \mathbf{Z}[y, x]$, and use Equation (5) to compute a candidate for $\tilde{f}$. Finally, we test if our candidate $\tilde{f}$ defines the same nonic field over $\mathbf{Q}_3$ as $f$ by using Panayi's $p$-adic root finding algorithm [13]. If it doesn't, we repeat with a better approximation to $h(y, x)$. Since we ultimately use complex roots in computing resolvents $f_{18}$, $f_{27}$, and $f_{36}$, we aim throughout to keep the coefficients of $\tilde{f}$ relatively small.
3 Galois Theory of Nonics

We will set things up over a general base field $F$, and specialize when necessary to our case of interest, $F = \mathbf{Q}_3$. We work in the context of abstract separable fields of finite degree over $F$. We say a nonic field $K$ is multiply imprimitive, uniquely imprimitive, or primitive, iff it has $\ge 2$, 1, or 0 cubic subfields respectively.

To bring in Galois theory, we imagine that a separable closure $\bar{F}$ of $F$ is given. The number of subfields of $\bar{F}$ isomorphic to a separable degree $n$ extension $K/F$ is $n/|\mathrm{Aut}(K)|$. We associate to $K$ the $n$-element set $X$ of homomorphisms $\sigma : K \to \bar{F}$. Let $K^{\mathrm{gal}}$ be the subfield of $\bar{F}$ generated by the images of these $n$ homomorphisms. Then we call $G = \mathrm{Gal}(K^{\mathrm{gal}}/F)$ the Galois group of $K$ with respect to the fixed separable closure of $F$. So $G$ is a transitive subgroup of the symmetric group $S_X$ and a quotient group of the absolute Galois group $\mathrm{Gal}(\bar{F}/F)$. Occasionally we will use this notation when $K$ is a separable algebra which is only a product of fields. Then $G$ is no longer a transitive subgroup of $S_X$ as indeed its minimal orbits correspond to the factor fields of $K$.

In the case $F = \mathbf{Q}$, one can take $\bar{\mathbf{Q}} \subset \mathbf{C}$ as a separable closure. However in other cases, like our case $F = \mathbf{Q}_3$, there is no simple choice of $\bar{F}$ and in practice one must work with objects which are independent of the choice of $\bar{F}$. We will therefore consider the Galois group of $K$ to be a subgroup of $S_n$ which is only defined up to conjugation.

There are 34 transitive subgroups of $S_9$ up to conjugation, 4 corresponding to multiply imprimitive fields, 19 to uniquely imprimitive fields, and 11 to primitive fields. So, given a nonic field $K$, one wants first to identify its Galois group among the 34 possibilities. The literature contains several accounts of computing Galois groups, with [8] being a recent survey. Some of these accounts pay particular attention to nonics [7,4]. The approach presented here is tailored to 3-adic fields, where it is easier to compute subfields and automorphism groups than it is to work with many different relative resolvents and/or large degree absolute resolvents.

Twenty-three of the thirty-four groups have just one Sylow 3-subgroup, while the remaining seven solvable groups have four Sylow 3-subgroups. The twenty-three groups will be particularly important for us and a partial inclusion diagram for just these groups is given in Figure 1. We use the T-notation of [1] to indicate the possible Galois groups, with T standing for transitive. This T-notation will be our main notation in the sequel as well. However in Tables 3.1, 3.2, and 3.3 we will also present a more descriptive notation based on [5].

In Figure 1, a line from $T_i$ down to $T_j$ means there are subgroups $G_i, G_j \subset S_9$ of type $T_i$, $T_j$ respectively, with $G_i \subset G_j$ and $|G_i| = |G_j|/2$. The text in Figure 1 briefly indicates how some of the phenomena presented in Tables 3.1, 3.2, and 3.3 relate to index two inclusions.

Over $F = \mathbf{Q}_3$, some of the groups can be easily ruled out. In general, let $F$ be a $p$-adic field, meaning a finite extension of either $\mathbf{Q}_p$ or $\mathbf{F}_p((t))$. Then a Galois extension $K^{\mathrm{gal}}/F$ has a filtration by Galois subfields

    $F \subseteq K^{\mathrm{gal},u} \subseteq K^{\mathrm{gal},t} \subseteq K^{\mathrm{gal}}$,
[Figure 1: partial inclusion diagram of the groups $T_i$, with annotations "Associated to triplets" and "Degenerate (2, 4, 5, 8) and primitive (9, 14, 15, 16, 19)".]

Fig. 1. Nonic groups having a normal Sylow 3-subgroup and their index two inclusions


with $K^{\mathrm{gal},u}$ being the maximal unramified subextension and $K^{\mathrm{gal},t}$ being the maximal tamely ramified subextension. The group $\mathrm{Gal}(K^{\mathrm{gal},u}/F)$ is necessarily cyclic. Similarly, $\mathrm{Gal}(K^{\mathrm{gal},t}/K^{\mathrm{gal},u})$ is cyclic; moreover, this subquotient has order prime to $p$. Finally $\mathrm{Gal}(K^{\mathrm{gal}}/K^{\mathrm{gal},t})$ is a $p$-group. Eleven of the thirty-four candidate Galois groups fail to have a corresponding chain of normal subgroups. Five of these excluded groups are uniquely imprimitive and six are primitive. They are presented with a dash in the last column in Table 3.2 and Table 3.3 respectively. The twenty-three groups which do have a corresponding chain are exactly those in Figure 1.

There are different approaches for identifying $\mathrm{Gal}(K^{\mathrm{gal}}/F)$. We begin by computing three quantities directly associated to $K$: the cubic subfields of $K$, the automorphism group of $K$, and the parity of $K$. There can be one, two, or four cubic subfields, and the possible Galois groups for a cubic subfield are $C_3$ and $S_3$. The automorphism group $\mathrm{Aut}(K)$ has nine elements exactly when $K$ is itself Galois, in which case $\mathrm{Aut}(K) = \mathrm{Gal}(K^{\mathrm{gal}}/F)$. Otherwise $|\mathrm{Aut}(K)|$ is 1 or 3 while $|\mathrm{Gal}(K^{\mathrm{gal}}/F)| > 9$. In our case of $F = \mathbf{Q}_3$, we used Panayi's root finding algorithm [13] for the computation of both subfields and automorphism groups. Parities are easier. By definition $K = \mathbf{Q}[x]/f(x)$ has parity $\epsilon = +$ if the polynomial discriminant of $f$ is a square in $F$ and $\epsilon = -$ otherwise. In the former case $\mathrm{Gal}(K^{\mathrm{gal}}/F)$ is in $A_9$ and in the latter case it is not.
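For the ground field $F = \mathbf{Q}_3$, deciding the parity reduces to a square test in $\mathbf{Q}_3$. The sketch below uses the standard fact that a nonzero 3-adic integer $d = 3^v u$ with $3 \nmid u$ is a square in $\mathbf{Q}_3$ exactly when $v$ is even and $u \equiv 1 \pmod 3$. The function names are hypothetical, and the input is assumed to be the (nonzero, ordinary integer) discriminant of $f$, computed elsewhere.

```python
def is_square_in_Q3(d):
    """True if the integer d is a square in the 3-adic field Q_3."""
    if d == 0:
        return True
    v = 0
    while d % 3 == 0:      # strip the power of 3
        d //= 3
        v += 1
    # even valuation and unit part congruent to 1 mod 3 (Hensel's lemma)
    return v % 2 == 0 and d % 3 == 1


def parity(disc):
    """Parity sign of a field over Q_3 defined by a polynomial of discriminant disc."""
    return "+" if is_square_in_Q3(disc) else "-"
```

For instance, a discriminant of $-108 = -4 \cdot 3^3$ has odd 3-adic valuation, so `parity(-108)` is `"-"`, while $-2 \equiv 1 \pmod 3$ makes `parity(-2)` equal to `"+"`.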

Tables 3.1, 3.2, and 3.3 have columns corresponding to the objects just discussed. A blank for $|\mathrm{Aut}(K)|$ signifies that $K$ has only the identity automorphism. Tables 3.2 and 3.3 also present information related to resolvents. A resolvent construction $f \mapsto f_*$ on the level of polynomials induces a well-defined resolvent construction $K \mapsto K_*$ on the level of algebras, where $K = F[x]/f(x)$ and $f(x)$ is chosen so that $f_*(x)$ is separable, in which case $K_* = F[x]/f_*(x)$. Finally, the column headed by # previews the next section by giving the number of nonic 3-adic fields with the given Galois group. If the Galois group is ruled out by the lack of an unramified/tame/wild filtration, we print a dash rather than a 0, as mentioned before.

3.1 Multiply Imprimitive Fields 

Suppose $K$ is a nonic field with more than one cubic subfield. Then if $K_1$ and $K_2$ are any two distinct cubic subfields, $K = K_1 \otimes K_2$. Table 3.1 gives the four possible Galois groups. In two cases, there are two cubic subfields, and in two cases there are four cubic subfields, as indicated. In this category, no resolvents are necessary for distinguishing the Galois groups.



Table 3.1. Nonic groups corresponding to nonic fields with more than one cubic subfield

    G   Name         |G|        |A|   Subs   ε    #
    2   E(9)         3^2        9     cccc   +    1
    4   S_3 × 3      2·3^2      3     sc     -    24
    5   3^2 : 2      2·3^2            ssss   +    1
    8   S_3 × S_3    2^2·3^2          ss     -    9


3.2 Uniquely Imprimitive Fields 

It is here that we will use the three specialized resolvent constructions of the previous section. They let us canonically construct algebras $K_{18}$, $K_{27}$, and $K_{36}$ of the indicated degree from a nonic field $K$ with a unique cubic subfield. Table 3.2 gives the nineteen possibilities for the Galois group of a nonic field with a unique cubic subfield.

Table 3.2. Nonic groups corresponding to nonic fields with exactly one cubic subfield. The first fourteen groups have one Sylow 3-subgroup and the last five groups have four Sylow 3-subgroups.



G 


Name 


|G| 


1^1 


Sub 


e 


Kgb Kqx 


K 27 


K27b Kgy 


m 


# 


1 


G(9) 


3^ 


9 


C 


~+ 


Ti 


ci 


If 


Tf 


Ci 




12 


3 


79(9) 


2^3^ 




s 


+ 


Ta 


Si 


rp3 

J-3 


rj-i3 

J-3 


si 




5 


6 


|[3«]3 


33 


3 


c 


+ 


Te 


T 2 


27+ 


27+ 


T 2 




8 


7 


[3^]3 


33 


3 


c 


+ 


T 7 


T 2 


TiTfTf 


27+ 


^3^3 ^3 


4 


4 


10 




2^3® 




s 


+ 


Tio 


n 


27+ 


27+ 


T 4 




49 


11 


E{9) : 6 


2^3® 




s 


+ 


Til 


n 


T 13 I 8 - 


27+ 


G6.18-G3 


2 * 


20 


12 


[O'lS's 


2^3® 


3 


s 


- 


Ti2 


Ts 


rjif rrit t rr^lll 

7 127 12 ^ 12 


27+ 


^3^3 *^3 


4 


36 


13 


77(9) : D% 


2^3® 




c 


- 


Ti3 


n 


Til 18- 


27+ 


G6.18-G3 


2 * 


20 


17 


3(3 


3^ 


3 


c 


+ 


Ti't 


T 7 


27+ 


27+ 


T 7 


3 


36 


18 


E{9) : 79i2 


2^3® 




s 


- 


Ti 8 


Ts 


T(gl 8 _ 


27+ 


Go, 30- S 3 


2 


48 


20 


31 S 3 


2^3^ 


3 


s 


- 


Tio 


Til 


27- 


27+ 


Ti3 


3 


180 


21 


|[3^ :2]S3 


2^3^ 




s 


+ 


TA 


Ti2 


27+ 


27+ 


Ti2 


3 


108 


22 


[3® : 2] 3 


2^3^ 




c 


- 


Ti2 


Ti3 


27- 


27+ 


Til 


3 


60 


24 


[3^ : 2 ] S 3 


2^3^ 




s 


- 


TL 


Tis 


27- 


27+ 


Ti 8 


3 


144 



G 


Name 


|G| 


1^1 


Sub 


e 


Kis 


K27 


K30 


m 


# 


25 




2^3^ 




C 


+ 


18+ 


27+ 


36+ 




— 


28 


S3 13 


2® 3^ 




c 


- 


18+ 


27- 


36+ 




- 


29 


lksi]S 3 


to 

CO 

CO 




s 


— 


18- 


27- 


36- 




— 


30 


Usl]S 3 


to 

CO 

CO 




s 


+ 


18- 


27+ 


36- 




- 


31 


S31S3 


2^3^ 




s 


- 


18- 


27- 


36- 




- 



The resolvents $K_{18}$, $K_{27}$ and $K_{36}$ are sometimes irreducible. In this case, the
corresponding slots on Table 3.2 contain 18+, 18-, 27+, 27-, 36+ or 36-. In
general, we indicate a field of degree > 9 by giving its degree and its parity. We
indicate a field of degree ≤ 9 by giving its Galois group.

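When $G$ has just one Sylow 3-subgroup there are unique factorizations

$$K_{18} = K_{9b} \times K_{9x}, \qquad (10)$$

$$K_{36} = K_{27b} \times K_{9y}, \qquad (11)$$

with factors having the indicated degree and the properties

$$K_{9b}^{\mathrm{gal}} = K^{\mathrm{gal}} = K_{27b}^{\mathrm{gal}}, \qquad K_{9x}^{\mathrm{gal}} = K_{9y}^{\mathrm{gal}} \subset^{3} K^{\mathrm{gal}}. \qquad (12)$$

Here the superscript 3 means that $[K^{\mathrm{gal}} : K_{9x}^{\mathrm{gal}}] = 3$. Table 3.2 indicates the
structure of $K_{9b}$, $K_{9x}$, $K_{27b}$, and $K_{9y}$. In the $K_{9y}$ column, $G_{6,18-}$ and $G_{6,36-}$
are sextic groups of order 18 and 36 respectively which are not contained in $A_6$.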
The m heading a column in Table 3.2 stands for multiplicity. In general, we
say that a k-tuplet of degree n is a complete list of non-isomorphic degree n fields
$K_1, \dots, K_k$ with all the $K_i^{\mathrm{gal}}$ the same. One speaks of singletons, twins, triplets,
and quadruplets, for k = 1, 2, 3, and 4 respectively. The $T_i$ row has a k in the
m column iff a nonic field $K$ with $\mathrm{Gal}(K^{\mathrm{gal}}/F) = T_i$ belongs to a k-tuplet. Here,
1 is indicated by a blank. In most cases, fields in a tuplet share the same $T_i$.
The one exception, indicated by a * in Table 3.2, is twins consisting of a $T_{11}$
nonic field and a $T_{13}$ nonic field. In fact, the thirty-four $T_i$ fall into thirty-three
isomorphism classes as abstract groups, the one coincidence being $T_{11} \cong T_{13}$.

Our priming convention in the resolvent columns of Table 3.2 is to distinguish
different fields with the same Galois group. A $T_i$ in the $T_i$ row means a nonic
field isomorphic to the original one. Note that in the instances of twinning,
one can always pass from a field to its twin via $K_{27}$. In the five instances of
triplets, it is $K_{18}$ which lets one pass from a nonic field $K = K_{9a}$ to a second
triplet $K_{9b}$. Applying this degree eighteen resolvent construction to $K_{9b}$ gives
the remaining triplet $K_{9c}$. Finally, applying it to $K_{9c}$ returns $K_{9a}$. Thus any
collection of nonic triplets comes with a natural cyclic order. Finally, in the two
instances of quadruplets, one can pass from a given field to the three others via
$K_{27}$.

Table 3.2 makes clear how one can compute Galois groups of uniquely imprimitive
nonic fields. The cases of one versus four Sylow 3-subgroups are distinguished
by the reducibility versus irreducibility of $K_{18}$ or equally well $K_{36}$. Within
the one Sylow 3-subgroup case, |A|, Sub, e, and $K_{9y}$ suffice to distinguish groups.
Within the four Sylow 3-subgroup case, |A|, Sub, and e alone distinguish groups,
except for $T_{29}$ versus $T_{31}$, for which the printed resolvent information doesn't help.
In our setting of $F = \mathbb{Q}_3$, neither $T_{29}$ nor $T_{31}$ arises, but one way to distinguish
them is by the discriminant resolvent. Recall $f_{\mathrm{Disc}}(x) = f_{\mathrm{disc}}(x^2)$. In both cases,
$K_{\mathrm{Disc}}$ factors as a degree 18 field times a degree 54 field. The degree 18 field has
even parity in the case $T_{29}$ and odd parity in the case $T_{31}$.



3.3 Primitive Fields 

Neither subfields nor automorphisms can help with determining the Galois group
of a primitive nonic field. The parity e remains helpful, as shown in Table 3.3.

Table 3.3 also gives the degrees and parities of the field factors of the degree
36 resolvent $K_{\mathrm{disc}}$. The information presented in Table 3.3 clearly suffices to
identify Galois groups associated to nonic 3-adic fields, or even to nonic extensions
of general 3-adic bases $F$ where $T_{15}$ might appear. For general bases, more
resolvents of higher degrees would be required ([4,7]).

Table 3.3. Nonic groups corresponding to primitive nonic fields

 G   Name         |G|             e   K_disc      #
 9   E(9):4       2^2·3^2         +   18l         2
14   E(9):Q_8     2^3·3^2         +   36+         4
15   E(9):8       2^3·3^2         +   36-         0
16   E(9):D_8     2^3·3^2         -   18- 18'-    4
19   E(9):2D_8    2^4·3^2         -   36-         20
23   E(9):2A_4    2^3·3^3         +   36+         -
26   E(9):2S_4    2^4·3^3         -   36-         -
27   PSL(2,8)     2^3·3^2·7       +   36+         -
32   PΓL(2,8)     2^3·3^3·7       +   36+         -
33   A_9          2^6·3^4·5·7     +   36+         -
34   S_9          2^7·3^4·5·7     -   36-         -

4 Nonic 3-adic Fields by Discriminant and Galois Group

In general, let $\mathcal{K}(p,n)$ be the set of isomorphism classes of degree n extension
fields of $\mathbb{Q}_p$. The paper [14] describes how one goes about finding polynomials
$f_i(x) \in \mathbb{Z}[x]$ such that $\mathbb{Q}_p[x]/f_i(x)$ runs over $\mathcal{K}(p,n)$. One key ingredient is the
root finding algorithm of [13], mentioned already in both of the previous sections.
Here, it is used to ensure that $\mathbb{Q}_p[x]/f_i(x)$ and $\mathbb{Q}_p[x]/f_j(x)$ are not isomorphic
for $i \neq j$. The other key ingredient is the mass formula of [11,15], as refined in
[14]. One knows that one has found enough polynomials to cover all of $\mathcal{K}(p,n)$
by the use of this formula. In this section, we present Table 4.1 summarizing
the result of the calculation in the case (p,n) = (3,9), and comment on several
features of the table.

Given a nonic 3-adic field $K$, let $K^u$ be its maximal subfield unramified over
$\mathbb{Q}_3$. Let $f = [K^u : \mathbb{Q}_3]$ and $e = 9/f$. The original mass formula of [11,15] makes
it natural to divide nonic fields $K$ into three classes according to e. The unique
field with e = 1 corresponds to the boldface entry with c = 0 in the row for
group $T_1 = C_9$. The 41 fields with e = 3 correspond to italicized entries, and the
remaining 753 are given in ordinary type.

The refined mass formula of [14] makes it natural to further divide nonic fields
$K$ according to their discriminant exponent c. This exponent indexes columns
in Table 4.1. Mass formulae can be used to count the number of subfields of $\overline{F}$
of a given type. Recall from Section 3 that if a field $K$ has automorphism group
$A$, then this count will reflect the $9/|A|$ isomorphic copies of $K$ in $\overline{F}$. Collecting
these contributions to the mass formula and dividing by the degree 9 gives a
count of isomorphism classes of fields, each weighted by their mass $1/|A|$. The
sums of these masses for a given (e, c) are given by the refined mass formula in
[14], and are presented as $M_e(c)$ towards the bottom of Table 4.1. Of course, one
has $\#_e(c) \ge M_e(c)$ with $\#_e(c)$ the total number of fields for a given (e, c).
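As a small illustration of the relation between counts and masses, the following sketch recomputes $\#_9(13)$ and $M_9(13)$ from the three c = 13 rows of Table 5.1. It assumes $|A| = 3$ for $T_{20}$ and $|A| = 1$ for $T_{18}$ and $T_{19}$, as read from the |A| columns of Tables 3.2 and 4.1; the variable names are ours and the snippet is only a consistency check, not the method of [14].

```python
from fractions import Fraction

# (group label, |A|, number of fields) for e = 9, c = 13.
# Field counts are taken from Table 5.1; the |A| values are an assumption
# read from the |A| columns of Tables 3.2 and 4.1.
rows = [("T18", 1, 2), ("T20", 3, 6), ("T19", 1, 2)]

# Plain count of isomorphism classes of fields.
count = sum(n for _, _, n in rows)

# Mass: each field is weighted by 1/|A|.
mass = sum(Fraction(n, a) for _, a, n in rows)

print("count =", count, "mass =", mass)
```

Under these assumptions the count and the mass agree with the entries 10 and 6 in the c = 13 column of the $\#_9$ and $M_9$ lines of Table 4.1.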

The lines corresponding to the four multiply imprimitive groups can be constructed
directly by tensoring pairs of cubic fields. For example, as $K_3$ runs over
the four $C_3$ fields and $K_3'$ runs over the six $S_3$ fields, $K_3 \otimes K_3'$ runs over the
twenty-four $T_4 = C_3 \times S_3$ fields. Similarly, the primitive groups $T_9$ and $T_{16}$ are
isomorphic to the sextic transitive groups $C_3^2.C_4$ and $C_3^2.D_4$ respectively. In each
of these two cases, one nonic field $K$ comes from two sextic fields $K_{6a}$, $K_{6b}$, with







J.W. Jones and D.P. Roberts 



Table 4.1. Discriminants and Galois groups of nonic extensions of $\mathbb{Q}_3$. Fields with
e = 1, 3, 9 are respectively indicated by bold, italic, and regular type.



G 


|4| 


0 


9 


10 


12 


13 


15 


16 


18 


19 


20 


21 


22 


23 


24 


25 


26 


# 


2 


9 








1 


























1 


4 


3 




3 




1 




6, 3 


3 




9 
















24 


5 


















1 


















1 


8 










1 




2 




3 


3 
















9 


1 


9 


1 






















9 










12 


3 


















1 








1 








3 


5 


6 


3 








;g 
















6 










8 


7 


3 








1 






3 




















4 


10 
















6 


11 








8 








24 


49 


11 










2 






1 


8 








9 










20 


12 


3 


















9 




27 












36 


13 






S 




1 




S 




3 


3 




9 












20 


17 


3 








9 








9 








18 










36 


18 












2 


4 




3 


12 




18 


9 










48 


20 


3 










6 


12 




9 


45 






27 






81 




180 


21 






















27 








27 




54 


108 


22 






6 




3 




6 










9 


9 


27 








60 


24 
















6 


12 


9 


27 


9 




27 


27 


27 




144 


9 










1 






1 




















2 


14 








1 










3 


















4 


15 




































0 


16 








1 










3 


















4 


19 






2 




2 


2 


6 


2 




6 
















20 


#1 




1 
































1 


Ml 




O.T 


































#3 






10 




20 




11 






















41 


M3 






8.6 




836 




9 
























#9 






2 


2 


6 


10 


30 


22 


66 


96 


54 


72 


96 


54 


54 


108 


81 


753 


Mg 






2 


2 


6 


6 


18 


18 


54 


54 


54 


54 


54 


54 


54 


54 


81 




^ 9,3 










4 


4 


12 


12 


36 


36 


54 
















Mg, 4 














4 


4 


12 


12 




36 


36 


54 










-^9,5 


















6 


6 




18 


18 




54 


54 


81 





$K^{\mathrm{gal}} = K_{6a}^{\mathrm{gal}} = K_{6b}^{\mathrm{gal}}$. Otherwise the lines of Table 4.1 cannot be constructed
from the lower degree tables of [10].

Twins, triplets, and quadruplets are visible in varying degrees on Table 4.1.
In general, for a nonic 3-adic field $K$ with discriminant $3^c$, one has $c = 2s_u + 6s_v$
with the slopes $s_u \le s_v$ discussed further in the next section. In a tuplet, $s_v$ is
always constant. For twins, $s_u$ typically varies within a twin pair; however one
can at least see that the total number of fields for $T_{11}$ and $T_{13}$ is the same and the
total number for $T_{18}$ is even. In a triplet, the cubic subfield is constant; if $3^{c_{\mathrm{sub}}}$
is its discriminant then $s_u = c_{\mathrm{sub}}/2$, so $s_u$ and hence $c$ is constant. This explains
conceptually why all entries on rows 17, 20, 21, 22, and 24 are multiples of 3.
For quadruplets, the possible $c_{\mathrm{sub}}$'s are (0, 4, 4, 4) in the case $T_7$ and (3, 5, 5, 5)
in the case $T_{12}$, explaining the structure of these rows.

The mass formulas mentioned already come from specializing mass formulas
for general p-adic base fields $F$ to the case where $F$ is an unramified extension of $\mathbb{Q}_3$
of degree 9, 3, or 1. One can also specialize these formulas to the case that $F$ is
a ramified cubic extension of $\mathbb{Q}_3$, and one gets the last three lines of Table 4.1.
The number in the $M_{9,c_{\mathrm{sub}}}$ row and the c column is the total mass of isomorphism
classes of nonic fields of discriminant $3^c$ with a specified cubic subfield of
discriminant $3^{c_{\mathrm{sub}}}$, where a pair $K_{\mathrm{sub}} \subset K$ is counted with mass $1/|A_0|$ where
$A_0$ is the group of automorphisms of $K$ stabilizing $K_{\mathrm{sub}}$. This is simplest in the
cases $c \ge 20$ where all fields are uniquely imprimitive and thus $|A| = |A_0|$.

5 Slopes in Nonic 3-adic Fields 

The paper [10] provides background on slopes in general. Here we will keep the
discussion focused on nonic 3-adic fields $K$ and their associated Galois fields
$K^{\mathrm{gal}}$. The results of the calculations are summarized in Tables 5.1 and 5.2.

Suppose $\mathrm{Gal}(K^{\mathrm{gal}}/\mathbb{Q}_3)$ has order $2^a 3^b$, so that $b \in \{2, 3, 4\}$. Then there
are b slopes to compute, which we always index in weakly increasing order,
$s_1 \le s_2\ (\le \cdots)$. Consider a chain of subfields

$$\mathbb{Q}_3 \subseteq K^{\mathrm{gal},0} \subset K^{\mathrm{gal},1} \subset K^{\mathrm{gal},2} \subset \cdots, \qquad (13)$$

with, as indicated, $K^{\mathrm{gal},j}$ having degree $2^a 3^j$ over $\mathbb{Q}_3$. There is only one choice
for $K^{\mathrm{gal},0}$ as, in all 23 cases not excluded by (9), the group $T_i$ has only one Sylow
3-subgroup. There may be several choices for some of the intermediate $K^{\mathrm{gal},j}$,
and then of course $K^{\mathrm{gal},b} = K^{\mathrm{gal}}$. We require that the intermediate fields be
chosen such that the discriminant exponents $c_j$ are the minimum possible for
Galois subfields of degree $2^a 3^j$; the $c_j$ are then uniquely defined. The $j$th slope
is then given by the formula

$$s_j = \frac{c_j - c_{j-1}}{2^a 3^j - 2^a 3^{j-1}}, \qquad (14)$$

an instance of Proposition 3.4 of [10]. Note that we are allowing 0 as a slope
here. Otherwise the slopes are > 1 and correspond to wild ramification.

Table 5.1. Slopes in nonic 3-adic fields with discriminant exponent c ≤ 18. Visible
slopes are in boldface and hidden slopes in ordinary type.

 c   G   Slopes                  #
 0   1   0, 0                    1
 9   4   0, 1.5                  2
 9  13   0, 1.5, 1.5             2
 9  22   0, 1.5, 1.5, 1.5        6
 9  19   1.125, 1.125            2
10  14   1.25, 1.25              1
10  16   1.25, 1.25              1
12   2   0, 2                    1
12   4   0, 2                    1
12   8   1.5, 1.5                1
12   1   0, 2                    2
12   6   0, 2, 2                 2
12   7   0, 2, 2                 1
12  11   0, 1.5, 1.5             2
12  13   0, 2, 2                 1
12  17   0, 2, 2, 2              9
12  22   0, 2, 2, 2              3
12   9   1.5, 1.5                1
12  19   1.5, 1.5                2
13  18   1.5, 1.5, 1.667         2
13  20   0, 1.5, 1.5, 1.667      6
13  19   1.625, 1.625            2
15   4   1.5, 2                  6
15   4   0, 2.5                  3
15   8   1.5, 2                  2
15  13   0, 1.5, 2.5             2
15  18   1.5, 1.5, 2             4
15  20   0, 1.5, 1.5, 2          12
15  22   0, 1.5, 1.5, 2.5        6
15  19   1.875, 1.875            6
16   4   2, 2                    3
16   7   0, 2, 2                 3
16  10   1.5, 2, 2.167           6
16  11   0, 2, 2                 1
16  24   1.5, 1.5, 2, 2.167      6
16   9   2, 2                    1
16  19   2, 2                    2
18   5   1.5, 2.5                1
18   8   1.5, 2.5                3
18   3   1.5, 2.5                1
18  10   0, 1.5, 2.5             2
18  10   1.5, 2, 2.5             9
18  11   0, 1.5, 2.5             2
18  11   2, 2, 2.333             3
18  11   1.5, 2, 2.5             3
18  13   2, 2, 2.333             3
18  17   0, 2, 2, 2.333          9
18  18   1.5, 2, 2.5             3
18  20   0, 2, 2, 2.333          9
18  24   1.5, 1.5, 1.667, 2.5    3
18  24   1.5, 1.5, 2, 2.5        9
18  14   2.25, 2.25              3
18  16   2.25, 2.25              3

Table 5.2. Visible and hidden slopes in nonic 3-adic fields with discriminant exponent
c ≥ 19.

 c   G   Slopes                  #
19   4   2, 2.5                  9
19   8   2, 2.5                  3
19  12   1.5, 2.5, 2.667         9
19  13   1.5, 2, 2.5             3
19  18   1.5, 2, 2.5             3
19  18   1.5, 2.5, 2.667         9
19  20   0, 1.5, 2.5, 2.667      18
19  20   1.5, 2, 2.5, 2.667      27
19  24   1.5, 2, 2.5, 2.667      9
19  19   2.375, 2.375            6
20  21   1.5, 2.5, 2.667, 2.833  27
20  24   1.5, 2.5, 2.667, 2.833  27
21  12   1.5, 2.5, 2.667         27
21  13   2, 2.5, 2.833           9
21  18   1.5, 2.5, 2.667         9
21  18   2, 2.5, 2.833           9
21  22   1.5, 2, 2.5, 2.833      9
21  24   1.5, 2, 2.5, 2.833      9
22   1   2, 3                    9
22   3   2, 3                    1
22   6   0, 2, 3                 6
22  10   0, 2, 3                 2
22  10   2, 2, 3                 6
22  11   2, 2.5, 2.833           9
22  17   0, 2, 2, 3              18
22  18   2, 2.5, 2.833           9
22  20   2, 2, 2.333, 3          27
22  22   2, 2, 2.333, 3          9
23  22   2, 2.5, 2.833, 3.167    27
23  24   2, 2.5, 2.833, 3.167    27
24  21   1.5, 2.5, 2.667, 3.167  27
24  24   1.5, 2.5, 2.667, 3.167  27
25  20   2, 2.5, 2.833, 3.333    81
25  24   2, 2.5, 2.833, 3.333    27
26   3   2.5, 3.5                3
26  10   0, 2.5, 3.5             6
26  10   2, 2.5, 3.5             18
26  21   1.5, 2.5, 2.667, 3.5    54

Two of the b slopes just discussed are visible in $K$ itself in the following
sense. Let $3^c$ be the discriminant of $K$. If $K$ has a cubic subfield, let
$K_{\mathrm{sub}}$ be a cubic subfield of minimum discriminant $3^{c_{\mathrm{sub}}}$. Then from the tower
$\mathbb{Q}_3 \subset K_{\mathrm{sub}} \subset K$, one has slopes $s_u = c_{\mathrm{sub}}/2$ and $s_v = (c - c_{\mathrm{sub}})/6$, another
instance of Proposition 3.4 of [10]. In this setting, $s_u \le s_v$. If $K$ is primitive,
then the visible slopes are $s_u = s_v = c/8$. An important point is that the highest
slope is always visible, i.e. $s_v = s_b$.
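The two formulas for visible slopes can be checked against the tables with a few lines of code. The sketch below is only illustrative: the function name is ours, and the value $c_{\mathrm{sub}} = 4$ used in the examples is inferred from the tabulated $s_u = 2$ rather than quoted from the text.

```python
from fractions import Fraction

def visible_slopes(c, c_sub=None):
    """Visible slopes (s_u, s_v) of a nonic 3-adic field of discriminant 3^c.

    c_sub is the discriminant exponent of a cubic subfield of minimum
    discriminant, or None when the field is primitive.
    """
    if c_sub is None:
        s = Fraction(c, 8)          # primitive case: s_u = s_v = c/8
        return s, s
    return Fraction(c_sub, 2), Fraction(c - c_sub, 6)

# A C_9 field with c = 22 (row "22, 1" of Table 5.2), assuming c_sub = 4.
s_u, s_v = visible_slopes(22, 4)
print(s_u, s_v)

# A primitive T_19 field with c = 19 (row "19, 19" of Table 5.2).
p_u, p_v = visible_slopes(19)
print(float(p_u), float(p_v))
```

In the imprimitive case the returned pair satisfies $c = 2s_u + 6s_v$ by construction, which is the relation used in Section 4.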

We call the remaining b - 2 slopes hidden. A priori, one might have expected
that to calculate them, one would have to compute field discriminants of
resolvents of relatively high degree, perhaps 27 say, and apply Proposition 3.4
of [10] yet again. However, as we explain next, in fact one needs to compute
discriminants only of associated nonics.

First, if $K$ has ≥ 2 cubic subfields or 0 cubic subfields, then b = 2 and so there
are no hidden slopes to compute. Now consider the 14 possible Galois groups
of a uniquely imprimitive 3-adic nonic. From Table 3.2, $T_1$ and $T_3$ have b = 2,
while $T_6$, $T_7$, $T_{10}$, $T_{11}$, $T_{12}$, $T_{13}$ and $T_{18}$ have b = 3, and finally $T_{17}$, $T_{20}$, $T_{21}$,
$T_{22}$, and $T_{24}$ have b = 4.

In the two b = 2 cases, we are again done. For the seven b = 3 cases, we can
consider the nonic resolvent $K_{9x}$. One has an exact sequence

$$C_3 \hookrightarrow \mathrm{Gal}(K^{\mathrm{gal}}/\mathbb{Q}_3) \twoheadrightarrow \mathrm{Gal}(K_{9x}^{\mathrm{gal}}/\mathbb{Q}_3). \qquad (15)$$

Now the two nonic groups of order 27, namely $T_6$ and $T_7$, both have only one
normal subgroup of order 3. So the remaining groups presently under consideration,
$T_{10}$, $T_{11}$, $T_{12}$, $T_{13}$ and $T_{18}$, also have only one normal subgroup of order
3. So it is necessarily the highest slope $s_3$ of $K$ which disappears upon passage
from $K$ to $K_{9x}$. Only one of the remaining slopes $s_1$ and $s_2$ is visible in
$K$, but both are visible in $K_{9x}$, allowing us to identify all three of $s_1$, $s_2$, and
$s_3$. This computation would work equally well replacing $K_{9x}$ by $K_{9y}$, as indeed
$K_{9x}^{\mathrm{gal}} = K_{9y}^{\mathrm{gal}}$.

Finally for the five b = 4 cases, we can use again the nonic resolvent $K_{9x}$ and
the exact sequence (15). The slopes of $K$ are $s_1$, $s_2$, $s_3$, and $s_4$. The 3-group $T_{17}$
has just one normal subgroup of order 3, and hence the normal overgroups $T_{20}$,
$T_{21}$, $T_{22}$, and $T_{24}$ also have just one normal subgroup of order 3. So the slopes of
$K_{9x}$ must be $s_1$, $s_2$, and $s_3$, which are all identified by the previous paragraph.
The highest slope $s_4$ is identified too, so all slopes of $K$ have been identified.
Again this computation would work equally well with $K_{9x}$ replaced by $K_{9y}$.



6 Global Applications 

The last section of [10] concerns degree 13 fields with Galois group $PSL_3(3)$ and
illustrates how ramification can be analyzed completely, with the analysis at 2 
requiring octic 2-adic fields and the analysis at 3 requiring nonic 3-adic fields. 
We are presently pursuing other applications which similarly require a number of 
group-theoretic preliminaries to describe globally. Here we will stay in a setting 
where the only group-theory we need is what we have set up in previous sections. 

Proposition 6.1 A. There are exactly thirteen isomorphism classes of solvable
nonic number fields with discriminant of the form $\pm 3^a$, namely the fields $K =
\mathbb{Q}[x]/f(x)$ with $f(x) = \sum a_{9-i} x^i$ as given in Table 6.2.

B. Assuming that Odlyzko's GRH lower bounds on discriminants hold, there are
no non-solvable nonic number fields with discriminant of the form $\pm 3^a$.

To establish Part A, we need to use the group theory set up in Sections 2 and
3, but not the specifically 3-adic information of Sections 4 and 5. First, one knows
that there are no quartic number fields with discriminant $\pm 3^a$. This implies that
there are no primitive nonic solvable number fields of discriminant $\pm 3^a$ because
all seven solvable primitive groups have a quotient group of the form $V_4$, $C_4$, $A_4$,
or $S_4$, according to Table 3.3. For the same reason, an imprimitive group $G$ can
only appear if its size has the form $3^b$ or $2 \cdot 3^b$.



Table 6.2. The thirteen nonic solvable number fields with discriminant of the form
$\pm 3^a$, sorted by increasing top slope

 c   G   a0  a1  a2  a3  a4  a5   a6   a7   a8   a9   s1   s2   s3     s4
19   4    1   0   0  -3   0   0   -6    0    0   -1   2    2.5
21  13    1   0   0  -3   0   0    0    0    0    1   2    2.5  2.833
22  11    1   0   0  -3   0   0    3    0    0    8   2    2.5  2.833
22   1    1   0  -9   0  27   0  -30    0    9    1   2    3
23  22    1   0   0  -6   0   0    9    0    0   -3   2    2.5  2.833  3.167
23  22    1   0   0  -3   0   0    0    0    0    3   2    2.5  2.833  3.167
23  22    1   0   0  -3   0   0   -9    0    0    3   2    2.5  2.833  3.167
25  20    1   0  -9  -6  27  36  -24  -54   -9   22   2    2.5  2.833  3.333
25  20    1   0  -9  -3  27  18  -24  -27   -9   23   2    2.5  2.833  3.333
25  20    1   0  -9  -3  27  18  -15  -27  -36   -4   2    2.5  2.833  3.333
26   3    1   0   0  -9   0   0   27    0    0   -3   2.5  3.5
26  10    1   0   0   0   0   0    0    0    0   -3   2    2.5  3.5
26  10    1   0   0  -9   0   0   27    0    0  -24   2    2.5  3.5


It is also known that there are exactly two cubic fields with discriminant
$\pm 3^a$, namely $\mathbb{Q}[x]/(x^3 - 3x - 1)$ with Galois group $C_3$ and $\mathbb{Q}[x]/(x^3 - 3)$ with
Galois group $S_3$. So we need look only at cubic extensions of these fields, using
the exhaustive method described in Chapter 5 of [3]. Since neither of the two
cubic fields contains cube roots of unity, the method requires us to adjoin cube
roots of unity to get sextic fields $K_6$, and look within degree eighteen overfields
$K_{18}$ of these to get the desired nonic fields. The method requires that $K_{18}/K_6$ be
abelian, but abelianness is ensured by $\mathrm{ord}_2(|G|) \le 1$. These computations, which
of course we have only briefly sketched here, establish Part A.
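As a quick sanity check on the two cubic polynomials just quoted, one can compute their polynomial discriminants with the classical formula $-4p^3 - 27q^2$ for $x^3 + px + q$. The helper names below are ours; the sketch only confirms that both discriminants are, up to sign, powers of 3, and says nothing about the completeness of the list.

```python
def depressed_cubic_disc(p, q):
    """Discriminant of the polynomial x^3 + p*x + q."""
    return -4 * p**3 - 27 * q**2

def is_signed_power_of_3(n):
    """True if n equals +3^k or -3^k for some k >= 0."""
    n = abs(n)
    if n == 0:
        return False
    while n % 3 == 0:
        n //= 3
    return n == 1

disc_c3 = depressed_cubic_disc(-3, -1)   # x^3 - 3x - 1
disc_s3 = depressed_cubic_disc(0, -3)    # x^3 - 3
print(disc_c3, disc_s3)
print(is_signed_power_of_3(disc_c3), is_signed_power_of_3(disc_s3))
```

The first discriminant is a positive square, consistent with Galois group $C_3$, while the second is negative, consistent with $S_3$.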

Before moving on to establishing Part B, we will comment on some ways
that Table 6.2 illustrates our previous sections. The $T_{11}$ field and the $T_{13}$ field
form a twin pair. Similarly, the three $T_{20}$ fields and the three $T_{22}$ fields each
form a triplet. Triplets have a cyclic order and if we call the top-listed field in
each triplet $K_a$, then the next is $K_b$ and the final one is $K_c$. Slopes are given in
the same format as Tables 5.1 and 5.2. The fact that the small visible slope can
change within twins but not triplets is illustrated.

Part B is similar to some other non-existence statements in the literature, for
example the statement in [16], which says in particular that there are no $PSL_2(8)$
nonics with discriminant of the form $\pm 2^a$. To establish Part B, we use the 3-adic
analysis of Sections 4 and 5, including the determination of hidden slopes. If $K$
is a nonic 3-adic field with slopes $s_1 \le \cdots \le s_b$ and tame degree $t$, then
the root discriminant of $K^{\mathrm{gal}}$ is $3^\beta$ with

$$\beta = \frac{2}{3} s_b + \cdots + \frac{2}{3^b} s_1 + \frac{1}{3^b} \cdot \frac{t-1}{t}. \qquad (16)$$

This type of calculation is explained further in [10]. From Tables 5.1 and 5.2, one sees that in
the cases b = 2, b = 3, and b = 4, the largest that $\beta$ can be is respectively 53/18 ≈
2.94, 55/18 ≈ 3.06, and 511/162 ≈ 3.15432, these bounds all being realized in
the largest discriminant case c = 26, always with t = 2. The corresponding $3^\beta$
are then approximately 25.40, 28.70, and 31.99. These numbers are thus upper
bounds for the root discriminant of a Galois number field with discriminant $\pm 3^a$
and Galois group $PSL(2,8)$, $P\Gamma L(2,8)$ and ($A_9$ or $S_9$) respectively. However
Odlyzko's GRH bounds say that a field with root discriminant at most 25.40,
28.70, and 31.99 respectively must have degree at most 380, 1000, and 4400 [12].
These numbers are respectively less than $|PSL(2,8)| = 504$, $|P\Gamma L(2,8)| = 1512$,
and $|A_9| = 9!/2$, giving Part B.
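The three extreme values of $\beta$ can be reproduced numerically. The sketch below assumes that (16) gives the top slope weight 2/3, the next 2/9, and so on down to $2/3^b$ for $s_1$, with tame term $(t-1)/(3^b t)$; the slope lists are the c = 26 rows of Table 5.2 with 2.667 read as 8/3, and the function name is ours.

```python
from fractions import Fraction

def beta(slopes, t):
    """Exponent of 3 in the root discriminant of K^gal, as in formula (16)."""
    s = sorted(Fraction(x) for x in slopes)
    b = len(s)
    # s[j] is s_{j+1}: the top slope has weight 2/3, the lowest 2/3^b.
    wild = sum(Fraction(2, 3 ** (b - j)) * s[j] for j in range(b))
    tame = Fraction(1, 3 ** b) * Fraction(t - 1, t)
    return wild + tame

# Extreme cases at c = 26, all with t = 2 (slopes taken from Table 5.2).
cases = {
    2: [Fraction(5, 2), Fraction(7, 2)],
    3: [Fraction(2), Fraction(5, 2), Fraction(7, 2)],
    4: [Fraction(3, 2), Fraction(5, 2), Fraction(8, 3), Fraction(7, 2)],
}
results = {b: beta(s, t=2) for b, s in cases.items()}
for b, value in results.items():
    print(b, value, round(3 ** float(value), 2))
```

Under these assumptions the exact fractions agree with the values 53/18, 55/18, and 511/162 quoted in the text.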

References 

[1] G. Butler and J. McKay. The transitive groups of degree up to eleven. Comm.
Algebra, 11(8):863-911, 1983.

[2] H. Cohen. A course in computational algebraic number theory. Springer-Verlag,
Berlin, 1993.

[3] H. Cohen. Advanced topics in computational number theory, volume 193 of Graduate
Texts in Mathematics. Springer-Verlag, New York, 2000.

[4] A. Colin. Relative resolvents and partition tables in Galois group computations.
In Proceedings of the 1997 International Symposium on Symbolic and Algebraic
Computation (Kihei, HI), pages 78-84 (electronic), New York, 1997. ACM.

[5] J. H. Conway, A. Hulpke, and J. McKay. On transitive permutation groups. LMS
J. Comput. Math., 1:1-8 (electronic), 1998.

[6] H. Darmon and D. Ford. Computational verification of $M_{11}$ and $M_{12}$ as Galois
groups over Q. Comm. Algebra, 17(12):2941-2943, 1989.

[7] Y. Eichenlaub. Problèmes effectifs de théorie de Galois en degrés 8 à 11. PhD
thesis, Université Bordeaux I, 1996.

[8] A. Hulpke. Techniques for the computation of Galois groups. In Algorithmic
algebra and number theory (Heidelberg, 1997), pages 65-77. Springer, Berlin, 1999.

[9] J. W. Jones and D. P. Roberts. Octic 2-adic fields. In preparation.

[10] J. W. Jones and D. P. Roberts. A database of local fields. Submitted, 2003.

[11] M. Krasner. Remarques au sujet d'une note de J.-P. Serre: "Une 'formule de
masse' pour les extensions totalement ramifiées de degré donné d'un corps local".
C. R. Acad. Sci. Paris Sér. A-B, 288(18):A863-A865, 1979.

[12] J. Martinet. Petits discriminants des corps de nombres. In Number theory days,
1980 (Exeter, 1980), volume 56 of London Math. Soc. Lecture Note Ser., pages
151-193. Cambridge Univ. Press, Cambridge, 1982.

[13] P. Panayi. Computation of Leopoldt's p-adic regulator. PhD thesis, University of
East Anglia, 1995.

[14] S. Pauli and X.-F. Roblot. On the computation of all extensions of a p-adic field
of a given degree. Math. Comp., 70(236):1641-1659 (electronic), 2001.

[15] J.-P. Serre. Une "formule de masse" pour les extensions totalement ramifiées de
degré donné d'un corps local. C. R. Acad. Sci. Paris Sér. A-B, 286(22):A1031-
A1036, 1978.

[16] J. Tate. The non-existence of certain Galois extensions of Q unramified outside
2. In Arithmetic geometry (Tempe, AZ, 1993), volume 174 of Contemp. Math.,
pages 153-156. Amer. Math. Soc., Providence, RI, 1994.




Montgomery Addition for Genus Two Curves 



Tanja Lange 

Institute for Information Security and Cryptology (ITSC)
Ruhr-Universität Bochum, Universitätsstraße 150, D-44780 Bochum, Germany
lange@itsc.ruhr-uni-bochum.de



Abstract. Hyperelliptic curves of low genus have received a lot of attention
in the recent past for cryptographic applications. They were shown to be
competitive with elliptic curves in speed and security. In practice, one
also needs to protect against side-channel analysis, a method that uses
information leaked during the computation to attack the system. For
elliptic curves, the curve arithmetic proposed by Montgomery requires a
comparably small number of field operations to perform a scalar
multiplication and at the same time achieves security against non-differential
side-channel attacks.

This paper studies the generalization of Montgomery arithmetic to
genus 2 curves. We do not give the explicit formulae here, but together
with the explicit formulae for affine or projective group operations the
results show how to implement it. The divisor classes can be represented
using only their first polynomials, a feature that is important for actual
implementations. Our method applies to arbitrary genus two curves over
arbitrary fields of odd characteristic which have at least one rational
Weierstraß point.

Keywords: Hyperelliptic curves, Montgomery arithmetic, fast arithmetic,
cryptographic applications



1 Introduction 

For integer factorization using elliptic curves, Montgomery [14] proposed the use
of a special class of curves to obtain fast computation of scalar multiples. The
computation of m-folds is also the main operation in public-key cryptosystems
based on the discrete logarithm problem. For these applications, Montgomery
arithmetic offers the additional advantage that it prevents attacks like simple
power analysis, which determine the secret multiplier by observing the chain of
additions and doublings corresponding to the digits of the multiplier in binary
expansion. There are other methods to achieve this goal but they are slower
and need more storage. For many implementations, space is restricted. Therefore,
these curves are attractive for implementations, and generalizations to even
characteristic [12] and curves in general Weierstraß form [3] have been obtained.

The use of hyperelliptic curves in cryptography was already proposed in 1989
by Koblitz [9] but it is a rather recent result that they can compete with elliptic
curves in terms of efficiency of the group law [1,11]. The security of low genus
hyperelliptic curves is assumed to be similar to that of elliptic curves of the same
group size. Here, low really means genus $g \le 3$ by [7,18], and even for $g = 3$
some care has to be taken.

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 309-317, 2004.

In this paper we show how to extend Montgomery's ideas to genus 2 curves.
We concentrate on this case as it allows us to state the expressions within one line.
The main problem introduced by increasing the genus appears already in this
case. We show how to add two group elements given only the first part of their
representation, provided that this part is also known for the difference of the
two elements. We do not give the explicit formulae but provide the ingredients
from which one can derive the recipe for implementation by following
the ideas in [11]. We concentrate our efforts on showing that one can reduce the
number of variables for the computations, which corresponds to the fact that for
Montgomery arithmetic on elliptic curves one does not use the y-coordinates.
This means that the proposed arithmetic does not need more variables than
usual arithmetic while preventing side-channel attacks and obtaining acceptable
performance. In this paper we do not deal with special classes of curves that
can offer a better performance but try to be as general as possible, allowing all
special choices to be included.

We first give an introduction to the arithmetic on hyperelliptic curves,
including elliptic curves. Then we briefly review scalar multiplication using the
Montgomery ladder and present Montgomery arithmetic for elliptic curves,
giving a mathematical reason for the approach taken in [3]. The core of the paper
shows how genus two Montgomery arithmetic works. We end with an outlook
on the performance of the algorithm.

2 Background on Hyperelliptic Curves 

Within the scope of this paper we can only give a short introduction to the topic. 
More details can be found in [6,13,17]. 

Let $\mathbb{F}_q$ be a finite field with q elements of odd characteristic p. Assume that
$C$ is a projective curve over $\mathbb{F}_q$ without singularities. The $\mathbb{F}_q$-rational divisors of
$C$ are formal sums of points (over $\overline{\mathbb{F}}_q$) of $C$ which are invariant under $G_{\mathbb{F}_q} :=
\mathrm{Aut}(\overline{\mathbb{F}}_q/\mathbb{F}_q)$. The degree of a divisor $D$ is the sum of the multiplicities of the
points occurring in it and is denoted by $\deg(D)$. A divisor is effective if all
multiplicities are non-negative. We define the divisor class group by the following
rule: two divisors are in the same class iff their difference consists of the zeroes
and poles (with multiplicity) of a function $f \in \mathbb{F}_q(C)$, i.e. they differ only by
the principal divisor $(f)$ attached to $f$. The $\mathbb{F}_q$-rational points of the Jacobian
variety of $C$, $J_C(\mathbb{F}_q)$, correspond to the $\mathbb{F}_q$-rational divisor classes of degree 0 of
$C$. An effective version of the Theorem of Riemann-Roch allows one to turn this into
an algorithmic procedure to work in $J_C(\mathbb{F}_q)$.

By Riemann-Roch, each class can be represented uniquely by a divisor of
the form $D = \sum_{P_i \in C(\overline{\mathbb{F}}_q) \setminus P_\infty} P_i - m P_\infty$, where $m \le g$ is the number of
points $P_i$ in the sum and $P_\infty$ is a
point at infinity. These representatives are called reduced. If one adds two classes
$D_1$, $D_2$ one determines a function passing through the points of the representing
divisors $D_1$, $D_2$ with appropriate multiplicity. The points of intersection not in
the support of $D_1$, $D_2$ form a divisor $\overline{D}_3$ such that $D_1 + D_2 + \overline{D}_3 \equiv 0$, as the
points on the function belong to a principal divisor. In the next step one looks
for a function passing through the points of $\overline{D}_3$, providing $D_3$ with $\overline{D}_3 + D_3 \equiv 0$,
i.e. $D_1 + D_2 \equiv D_3$.

In this paper we concentrate on hyperelliptic curves, including elliptic curves.
We define a hyperelliptic curve $C$ to be a projective irreducible non-singular curve
of genus $g \ge 1$ with a generically étale morphism $\pi$ of degree 2 to $\mathbb{P}^1$. If $C$ has
at least one $\mathbb{F}_q$-rational Weierstraß point (a fixed point under $\pi$) then the affine
part of $C$ is given by

$$C_a : y^2 = f(x), \quad f \in \mathbb{F}_q[x], \text{ monic}, \ \deg(f) = 2g + 1.$$

The procedure can be visualized easily for genus 2 curves over the reals. 




For implementations one needs to have a space-efficient representation of 
the divisor classes and also to be able to describe the arithmetic in terms of 
these representatives. One uses the Mumford representation [15] [page 3.17] of 
the divisor classes. As the divisor class group is isomorphic to the ideal class 
group of the function field belonging to C one has 

Theorem 1 (Mumford Representation). 

Let the function field be given via the absolutely irreducible polynomial $y^2 +
h(x)y - f(x)$, where $h, f \in \mathbb{F}_q[x]$, $\deg f = 2g + 1$, $\deg h \le g$. Each nontrivial
ideal class over $\mathbb{F}_q$ can be represented via a unique ideal generated by $u(x)$ and
$y - v(x)$, $u, v \in \mathbb{F}_q[x]$, where

1. $u$ is monic,

2. $\deg v < \deg u \le g$,

3. $u \mid v^2 + vh - f$.

Let $D = \sum_{i=1}^{r} P_i - rP_\infty$, where $P_i \ne P_\infty$, $P_i \ne -P_j$ for $i \ne j$ and $r \le g$. Put $P_i =
(a_i, b_i)$. Then the corresponding ideal class is represented by $u = \prod_{i=1}^{r}(x - a_i)$,
and if $P_i$ occurs $n_i$ times then $\left(\frac{d}{dx}\right)^j \left[v(x)^2 + v(x)h(x) - f(x)\right]_{|x=a_i} = 0$, $0 \le j \le n_i - 1$.

For the arithmetic one can use Cantor’s algorithm [4], which is just an algo- 
rithmic description of the steps detailed for the divisor classes. 

Algorithm 1 (Composition) 

INPUT: $D_1 = [u_1, v_1]$, $D_2 = [u_2, v_2]$, $C : y^2 = f(x)$.

OUTPUT: $D = [u, v]$ semi-reduced with $D \equiv D_1 + D_2$.

1. compute $d_1 = \gcd(u_1, u_2) = e_1u_1 + e_2u_2$;

2. compute $d = \gcd(d_1, v_1 + v_2) = c_1d_1 + c_2(v_1 + v_2)$;

3. let $s_1 = c_1e_1$, $s_2 = c_1e_2$, $s_3 = c_2$;

4. $u = \frac{u_1u_2}{d^2}$;
   $v = \frac{s_1u_1v_2 + s_2u_2v_1 + s_3(v_1v_2 + f)}{d} \bmod u$.

Algorithm 2 (Reduction) 

INPUT: $D = [u, v]$ semi-reduced.

OUTPUT: $D' = [u', v']$ reduced with $D \equiv D'$.

1. let $u' = \frac{f - v^2}{u}$, $v' = (-v) \bmod u'$;

2. if $\deg u' > g$ put $u = u'$, $v = v'$;
   goto step 1;

3. make $u'$ monic.

The picture corresponds to the case $\gcd(u_1, u_2) = 1$ which is the most frequent
one. In practice, the other cases (except for doublings) can be neglected.
To actually implement the arithmetic on hyperelliptic curves, special explicit 
formulae offer much better performance than Cantor’s algorithm (see [1,11]). 
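
Algorithms 1 and 2 can be sketched in a few lines of Python for the case $h = 0$. This is only an illustrative sketch and not one of the explicit formulae cited above: the prime `P = 31`, the curve `F_CURVE` and all helper names (`xgcd`, `from_points`, `cantor_add`, ...) are assumptions made for the example. Polynomials are coefficient lists over $\mathbb{F}_P$, lowest degree first, with `[]` as the zero polynomial; `pow(x, -1, P)` requires Python 3.8 or later.

```python
P = 31  # small prime, chosen only for illustration

def trim(a):
    """Reduce coefficients mod P and drop leading zero coefficients."""
    a = [c % P for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def add(a, b):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return trim([x + y for x, y in zip(a, b)])

def neg(a):
    return trim([-c for c in a])

def sub(a, b):
    return add(a, neg(b))

def mul(a, b):
    if not a or not b:
        return []
    r = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            r[i + j] += x * y
    return trim(r)

def divmod_poly(a, b):
    """Quotient and remainder of a by a nonzero polynomial b."""
    r, b = trim(a), trim(b)
    q = [0] * max(len(r) - len(b) + 1, 0)
    inv = pow(b[-1], -1, P)
    while len(r) >= len(b):
        k, c = len(r) - len(b), r[-1] * inv % P
        q[k] = c
        r = sub(r, mul([0] * k + [c], b))   # cancels the leading term of r
    return trim(q), r

def xgcd(a, b):
    """Monic g = gcd(a, b) together with s, t such that g = s*a + t*b (a nonzero)."""
    r0, r1 = trim(a), trim(b)
    s0, s1, t0, t1 = [1], [], [], [1]
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, sub(s0, mul(q, s1))
        t0, t1 = t1, sub(t0, mul(q, t1))
    inv = [pow(r0[-1], -1, P)]
    return mul(r0, inv), mul(s0, inv), mul(t0, inv)

# Hypothetical genus 2 curve y^2 = x(x-1)(x-2)(x-3)(x-4) over F_31.
F_CURVE = trim([0, 24, -50, 35, -10, 1])

def from_points(p1, p2):
    """Mumford representation [u, v] of P1 + P2 - 2*P_inf, assuming distinct x-coordinates."""
    (a1, b1), (a2, b2) = p1, p2
    slope = (b2 - b1) * pow((a2 - a1) % P, -1, P) % P
    return mul([-a1, 1], [-a2, 1]), trim([b1 - slope * a1, slope])

def cantor_add(D1, D2, f=F_CURVE, g=2):
    """Composition (Algorithm 1) followed by reduction (Algorithm 2)."""
    (u1, v1), (u2, v2) = D1, D2
    d1, e1, e2 = xgcd(u1, u2)
    d, c1, c2 = xgcd(d1, add(v1, v2))
    s1, s2, s3 = mul(c1, e1), mul(c1, e2), c2
    u = divmod_poly(mul(u1, u2), mul(d, d))[0]
    num = add(add(mul(mul(s1, u1), v2), mul(mul(s2, u2), v1)),
              mul(s3, add(mul(v1, v2), f)))
    v = divmod_poly(divmod_poly(num, d)[0], u)[1]
    while len(u) - 1 > g:                      # reduce until deg u <= g
        u = divmod_poly(sub(f, mul(v, v)), u)[0]
        v = divmod_poly(neg(v), u)[1]
    return mul(u, [pow(u[-1], -1, P)]), v      # make u monic
```

In this sketch the neutral element is `([1], [])`, and the reduction loop tests the degree before dividing so that an already reduced input is left untouched.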

3 Montgomery Ladder 

Here we state how to compute scalar multiples using the Montgomery ladder. 
Details on cryptographic applications can be found in [8] . 

Let $G$ be a group and $D \in G$. Assume that we want to compute $nD$ for a positive
integer $n = \sum_{i=0}^{l-1} n_i2^i$. The obvious way is to do double-and-add according
to the binary expansion. As mentioned above, for cryptographic applications it
is desirable to have indistinguishable operations: the sequence of additions and
doublings should not reveal any information about the digits of $n$. The easy way
out is to perform a double-and-always-add algorithm. This has the disadvantage
that one loses performance, but an observer only sees that each doubling is followed
by an addition. One needs to pay attention to modify the input for the
dummy operations as otherwise one could distinguish between real and dummy
additions by the power-trace. Consequently, one can just as well perform useful
additions in each step that always depend on the output of the previous step.
Put $E_j = \sum_{i=j}^{l-1} n_i2^{i-j}D$ and $F_j = E_j + D$. We have

$$E_j = 2E_{j+1} + n_jD = E_{j+1} + F_{j+1} + n_jD - D = 2F_{j+1} + n_jD - 2D.$$

This implies

$$(E_j, F_j) = \begin{cases} (2E_{j+1},\ E_{j+1} + F_{j+1}) & \text{if } n_j = 0, \\ (E_{j+1} + F_{j+1},\ 2F_{j+1}) & \text{if } n_j = 1. \end{cases}$$

This technique is applicable for any group and it allows one to hide the sequence
of bits in $n$, e.g. one could always perform the doubling first and then the
addition to get a uniform performance. For an integer of length $l$ this requires $l$
additions and $l$ doublings. Additionally, the number of variables is doubled. For
the following we note that the difference of $F_j$ and $E_j$ is kept constant: in each
step it is equal to $D$.
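
The recurrence for $(E_j, F_j)$ translates directly into a loop over the bits of $n$, starting from $(E_l, F_l) = (0, D)$. The following Python sketch works in any group passed in through the hypothetical callables `add` and `double`. It illustrates only the sequence of operations; plain Python with a branch on the bit is of course not side-channel resistant.

```python
def montgomery_ladder(n, D, add, double, zero):
    """Return n*D using one addition and one doubling per bit of n."""
    E, F = zero, D                      # invariant: F - E == D
    for bit in bin(n)[2:]:              # most significant bit first
        if bit == "0":
            E, F = double(E), add(E, F)
        else:
            E, F = add(E, F), double(F)
    return E
```

For instance, with the additive group of integers modulo 1009 one can call `montgomery_ladder(n, 5, lambda a, b: (a + b) % 1009, lambda a: 2 * a % 1009, 0)` and compare the result with `5 * n % 1009`.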

If affine coordinates are used in curve based cryptography the inversions in 
the addition and in the doubling can be performed together using Montgomery’s 
trick. This is an obvious saving but usually inversion free systems are applied. 
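
Montgomery's trick replaces several field inversions by a single inversion and a few multiplications. A minimal Python sketch over a prime field follows; the function name `batch_inverse` and the modulus used in the usage note are assumptions for illustration, and all inputs are assumed nonzero modulo `p`.

```python
def batch_inverse(values, p):
    """Invert all (nonzero) values modulo the prime p with one inversion."""
    prefix = [1]
    for v in values:                    # prefix[i] = values[0] * ... * values[i-1]
        prefix.append(prefix[-1] * v % p)
    inv = pow(prefix[-1], -1, p)        # the single inversion
    out = [0] * len(values)
    for i in range(len(values) - 1, -1, -1):
        out[i] = inv * prefix[i] % p    # inverse of values[i]
        inv = inv * values[i] % p       # now the inverse of prefix[i]
    return out
```

In the ladder, `values` would hold the two denominators arising in the addition and in the doubling of one step, e.g. `batch_inverse([3, 7], 10007)`.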

4 Montgomery Arithmetic on Elliptic Curves 

In the previous section we obtained a method that hides the binary expansion 
of the multiplier. We now describe the idea behind Montgomery arithmetic, 
partially following [3]. We concentrate on the feature that one would like to 
reduce the amount of storage needed to represent a point, namely that one can 
get rid of the y-coordinate. We do not touch the doubling as this can obviously 
be done by using division polynomials (see e. g. [2,16]). 

An elliptic curve is a curve of genus 1. For $p > 3$ it can be given by a
Weierstraß equation $E : y^2 = x^3 + a_4x + a_6$, $a_4, a_6 \in \mathbb{F}_q$. Montgomery proposed
to use curves of the form $y^2 = x^3 + a_2x^2 + a_4x$. Such curves can always be
transformed to a Weierstraß equation but the converse is not true. Curves in
Montgomery form have at least one point of order 2 leading to a non-prime
group order. Therefore, it was desirable to have an analogue of Montgomery
arithmetic for the general curves as well.

In Section 2 we introduced the representation of points on the Jacobian via 
polynomials. For elliptic curves the Jacobian is isomorphic to the group of points. 
If a point is given by $(x_P, y_P)$ then the corresponding Mumford representation is
$[x - x_P, y_P]$. Therefore, the arithmetic is usually given in terms of coordinates.
To motivate the procedure in the genus 2 case and to obtain faster additions 
than [3] we stick to Cantor’s algorithm for now. 




Assume that we are given the first polynomials $u_1 = x - x_1$, $u_2 = x - x_2$
of $D_1, D_2$ and $u_- = x - x_-$ of $D_- = D_1 - D_2$ and that we want to compute
$D_+ = D_1 + D_2$, $u_+ = x - x_+$. Additionally we assume that $x_1 \ne x_2$ as otherwise
we would either double a class or end up with the trivial element. Therefore,
$d_1 = d = 1$ in Algorithm 1 and $s_1 = e_1 = 1/(x_2 - x_1) = -e_2 = -s_2$. In Step 4
the intermediate results are $\tilde{u}_+ = \tilde{u}_- = u_1u_2$ and $\tilde{v}_+ = (u_1v_2 - u_2v_1)/(x_2 - x_1)
\bmod \tilde{u}_+$, $\tilde{v}_- = (-u_1v_2 - u_2v_1)/(x_2 - x_1) \bmod \tilde{u}_-$. The modular reduction is not
necessary as the degree of $\tilde{v}_\pm$ is bounded by 1. The $v_i = y_i$ are unknown, but
$y_i^2 = f(x_i)$ as $(x_i, y_i) \in E$ and for the difference we know the resulting $u'_-$. This
motivates taking the product of $u'_+$ with $u'_-$ leading to



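$$u'_+u'_- = x^2 - (x_+ + x_-)x + x_+x_- = \frac{(f - \tilde{v}_+^2)(f - \tilde{v}_-^2)}{u_1^2u_2^2} = \frac{\bigl(f - (u_1^2v_2^2 + u_2^2v_1^2)/(x_2 - x_1)^2\bigr)^2 - 4u_1^2u_2^2v_1^2v_2^2/(x_2 - x_1)^4}{u_1^2u_2^2}.$$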

Multiplying out leads to

$$x_+ = \frac{2}{(x_1 - x_2)^2}\bigl(2a_6 + (x_1 + x_2)(a_4 + x_1x_2)\bigr) - x_-.$$



The computation of $x_+$ needs 4 multiplications and 1 inversion. This means a
saving of 2 multiplications compared to [3]. They used the term $x_+x_-$ instead of
$x_+ + x_-$. However, in most applications one avoids inversions as they are much
more costly than multiplications. Allowing a further coordinate $z_i$ per point leads
to

$$x_+ = 2\bigl(2a_6(z_1z_2)^2 + (x_1z_2 + x_2z_1)(a_4z_1z_2 + x_1x_2)\bigr) - x_-(x_1z_2 - x_2z_1)^2,$$
$$z_+ = (x_1z_2 - x_2z_1)^2.$$

This needs 7 multiplications and 3 multiplications by fixed constants just as the
formulae in [3].
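
The $x$-only addition can be checked numerically against ordinary chord addition. The sketch below assumes a curve $y^2 = x^3 + a_4x + a_6$ over a prime field; the prime `P = 1009`, the coefficients `A4`, `A6` and the function names are hypothetical choices for the example. `diff_add_x` is the affine formula for $x_+$ and `diff_add_xz` its inversion-free variant with an affine $x_-$.

```python
P = 1009          # illustrative prime
A4, A6 = 3, 7     # hypothetical curve coefficients, y^2 = x^3 + A4*x + A6

def affine_add(p1, p2):
    """Ordinary chord addition of two affine points with distinct x-coordinates."""
    (x1, y1), (x2, y2) = p1, p2
    lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P

def diff_add_x(x1, x2, x_minus):
    """x(D1 + D2) from x(D1), x(D2) and x(D1 - D2), using one inversion."""
    inv = pow((x1 - x2) ** 2 % P, -1, P)
    return (2 * (2 * A6 + (x1 + x2) * (A4 + x1 * x2)) * inv - x_minus) % P

def diff_add_xz(x1, z1, x2, z2, x_minus):
    """Inversion-free version: returns (x_plus, z_plus) as a fraction x_plus/z_plus."""
    w = (x1 * z2 - x2 * z1) ** 2 % P
    zz = z1 * z2 % P
    x_plus = (2 * (2 * A6 * zz * zz + (x1 * z2 + x2 * z1) * (A4 * zz + x1 * x2))
              - x_minus * w) % P
    return x_plus, w
```

Given two points with $x_1 \ne x_2$, `diff_add_x(x1, x2, x_minus)` should agree with the $x$-coordinate returned by `affine_add`, without ever touching a $y$-coordinate.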

5 Montgomery Arithmetic for Genus Two Curves 

In this section we try to mimic the approach taken in the previous section. We 
only consider the case of co-prime $u_1, u_2$, as the other ones are very unlikely
to be needed and doubling can be done with Cantor’s generalization of the
division polynomials [5]. Like for elliptic curves they only involve the $u$-part of
the representation. 

For genus two curves two main problems occur: the intermediate is ac- 
tually reduced and one only knows f = mod u. Thus, is given only up to 
a multiple of u. Furthermore, the resulting u'_^ will not be monic, but we here 
know the leading coefficient. 

There are obvious reasons for this behavior. Each fixed $u$-polynomial belongs
to 4 different divisor classes depending on the signs of the $y$-coordinates of the
points. Therefore, $f \bmod u$ cannot contain all information on $v$. But, if the
difference of two classes is known, this cancels out two possible assignments of
the signs, leading to a unique result. This can be visualized in the picture above:
$u$ fixes only the $x$-coordinates of $P_i, Q_i$, hence there are 4 different groupings
of the points to form the two divisors. Knowing that there is some operation
leading to the $x$-coordinates of $R_i$ reduces the choice to 2 possibilities, the one
depicted and the one obtained by reflecting at the $x$-axis, and both will lead to
the same $u$-part of $P_1 + P_2 - Q_1 - Q_2 - 4P_\infty$.

To proceed we need to introduce some variables. Keep $e_i$ from Algorithm 1
and put $u_i = x^2 + u_{i1}x + u_{i0}$, $b_i = b_{i1}x + b_{i0} = (e_iv_{3-i})^2 = e_i^2f \bmod u_{3-i}$ and
$c_i = c_{i1}x + c_{i0} = e_iv_{3-i} \bmod u_{3-i}$. Then $c_i^2 = b_i + a_iu_{3-i}$ for some constant $a_i$
with

$$c_{i1}^2 = a_i \quad (u \text{ is monic}),$$
$$2c_{i1}c_{i0} = b_{i1} + u_{(3-i)1}a_i,$$
$$c_{i0}^2 = b_{i0} + u_{(3-i)0}a_i.$$

Therefore, $a_i$ is a root of the quadratic polynomial

$$h_i(x) = (u_{(3-i)1}^2 - 4u_{(3-i)0})x^2 + (2u_{(3-i)1}b_{i1} - 4b_{i0})x + b_{i1}^2.$$

We now use that for each polynomial PI one has u\P[ = u\P[ mod U1U2, 
where P[ = H mod M2, to simplify the expression of To compute u+ we 

only need to compute the three leading terms 

(*) u+m'_ = (oi — 02)^(a;‘* — (m+i + M_i)a:^ + (u+iu_i + u+o + M_o)a;^ + . . . ) 

ulul 

{ul{bi + aiU2) + ul{b2 + a2Ui) - ff 

— ^ ^2 ^("1 ®1'*^2)(02 + CI2U1). 

Due to the construction of the $b_i$ this last expression is always divisible by $(u_1u_2)^2$.
To recover $u_+$ it thus suffices to get the $a_i$. In principle one could factor the
above $h_i$ and choose the right $a_i$ from the fact that $a_i = c_{i1}^2$ is a square. But
this would imply taking square-roots which is a rather costly operation in the
scale we are working in here. However, we know more, namely we know that the
resulting polynomial in (*) of degree 4 is divisible by $u_-$. Formally computing
s = SiCC -k So = ("1 (^1 +°1 "2 )+« A&2+a2 ) -/) _ OiM 2)(62 + 02^1) mod u_ 

leads to $s_i$ which are quadratic in $a_j$. To obtain $a_i$ one computes the resultants
$r_i := \mathrm{res}(s_1, h_{3-i})$ with respect to $a_{3-i}$ and obtains $x - a_i = \gcd(r_i, h_i)$. In
the computation of $r_i$ one could just as well take $s_0$. A detailed breakdown to
obtain explicit formulae will allow one to choose the one which needs fewer operations.
Experiments do not show any advantage in choosing either $s_1$ or $s_0$.

This explanation allows one to derive explicit formulae, just as Cantor’s algorithm
can be made explicit. E.g. to compute (*) one should not compute the numerator
and then divide by $(u_1u_2)^2$ but divide by $u_2$ from the very beginning just like
in the usual explicit formulae. However, the resulting algorithms require more
operations than an addition involving the $v$. The main objective of this paper
is to show that Montgomery arithmetic is possible for genus 2 curves. Finding
curve equations that allow faster computations and determining inversion-free
formulae is a topic of current research.



6 Conclusion and Outlook 

We have provided the mathematical background to implement Montgomery 
arithmetic on genus two curves. Detailing the group operations as explicit for- 
mulae is beyond the scope of this paper. We have shown that for Montgomery 
arithmetic the space requirements are considerably lower, namely they are just 
the same as for usual arithmetic where one keeps the second polynomial. To find 
curves where this process also allows faster arithmetic is work in progress. For 
an implementation it might be wise to avoid inversions. An approach similar to 
that for elliptic curves based on [10] will do this. 

For genus two curves one can also recover the polynomial Vi from knowing 
the full representation [u 2 ,V 2 ] and For that we take Vu,Viq as variables 

and expand equation (*) including the knowledge of the coefficients of V 2 - This 
leads to a system of equations of degree 2 in vn,uio which can be solved by 
computing resultants. 

For genus g the number of unknowns aij is 2{g— 1). Accordingly this requires 
more operations before can be computed. 

Abstract. Let $K$ be a totally real number field. It is known that the
values $\zeta_K(-n)$ of the Dedekind zeta function $\zeta_K(s)$ of $K$ are rational
numbers for all integers $n \ge 1$. We develop a rigorous and
reasonably fast method for computing these exact values. Our method
is in fact developed in the case of totally real number fields $K$ of any
degree for which $\zeta_K(s)/\zeta(s)$ is entire, which is conjecturally always the
case (and holds true if $K$ is cubic or if $K/\mathbb{Q}$ is normal).



1 Introduction 

Let $K$ be a totally real number field of degree $m > 1$. Let $d_K$ and $\zeta_K(s)$ denote
respectively its discriminant and Dedekind zeta function. Set $A_K := \sqrt{d_K/\pi^m}$.
It is well known that $\Lambda_K(s) := A_K^s\Gamma^m(s/2)\zeta_K(s)$ is meromorphic, with only two
poles, at $s = 1$ and $s = 0$, both simple, and satisfies the functional equation
$\Lambda_K(1 - s) = \Lambda_K(s)$. Hence, if $-2n < 0$ is even then $\zeta_K(-2n) = 0$, and if
$1 - 2n \le -1$ is odd then

$$\zeta_K(1 - 2n) = (-1)^{mn}d_K^{2n - 1/2}\left(\frac{2(2n-1)!}{(2\pi)^{2n}}\right)^m\zeta_K(2n)$$

(use the functional equation $\Gamma(s)\Gamma(1 - s) = \pi/\sin(\pi s)$). Moreover, according
to the Siegel-Klingen theorem, if $n \ge 1$ is an integer, then $\zeta_K(-n)$ is a rational
number. We want to explain how one can efficiently compute these rational
numbers. To begin with, if $K$ is abelian and if we let $X_K$ denote the group (of
order $m$) of primitive even Dirichlet characters associated with $K$, then $\zeta_K(s) =
\prod_{\chi \in X_K} L(s, \chi)$ and the known formulae for $L(-n, \chi)$ make the computation of
$\zeta_K(-n)$ straightforward. Indeed, if $\chi$ is a character modulo $f > 1$ then

$$L(1 - 2n, \chi) = -\frac{B_{2n,\chi}}{2n} \qquad (n \ge 1)$$

where

$$\sum_{a=1}^{f}\chi(a)\frac{te^{at}}{e^{ft} - 1} = \sum_{n \ge 0} B_{n,\chi}\frac{t^n}{n!}$$

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 318-326, 2004. 
(e.g., see [Was, Theorem 4.2]. See also [Lou3, Formula (1) page 70] for another
formula for $L(2n, \chi)$ for even Dirichlet characters $\chi$). In particular, if $\chi$ modulo
$f > 1$ is even, then

$$L(-1, \chi) = -\frac{1}{2f}\sum_{a=1}^{f} a^2\chi(a),$$

which, using $\zeta(-1) = -1/12$, yields

$$\zeta_K(-1) = \frac{(-1)^m}{12(2f_K)^{m-1}}\prod_{j=1}^{m-1}\Bigl(\sum_{a=1}^{f_K} a^2\chi_K^j(a)\Bigr) \qquad (1)$$

for a real cyclic field $K$ of conductor $f_K > 1$ and prime degree $m > 1$ associated
with a primitive modulo $f_K$ even Dirichlet character $\chi_K$. If $K$ of square-free
conductor $f_K > 1$ is defined by some $\mathbb{Q}$-irreducible polynomial $P_K(x)$ of degree
$m$ with integral coefficients, we already explained in [Lou6, Section 2] how one
can recover a character $\chi_K$ of order $m$ generating $X_K$ prior to using (1). In
particular, we used this technique for checking the Tables published in [KK],
and realized that almost all the values in these Tables are false! Indeed, we
give in Table 1 below an excerpt of the computation we did for cyclic simplest
cubic fields $K_m$ associated with the $\mathbb{Q}$-irreducible cubic polynomials $P_m(x) =
x^3 + mx^2 - (m + 3)x + 1$ (with $m \ge -1$ such that $D_m := m^2 + 3m + 9$ is square-free).
We point out that whereas our values do not agree with those obtained
in [KK], they do agree with the ones obtained by using the functions zetakinit
and zetak of the software PARI/GP, i.e. by using the functions initially written
by E. Tollis according to his method developed in [Tol]

factor(round(-21*zetak(zetakinit(subst(x^3 + m*x^2 - (m+3)*x + 1, m, •)), -1))),

and that the authors of [KK] could have used this software to realize that there
was something going wrong with their computation. In fact, when we met
H. K. Kim in Tokyo last October, he told us that Theorem 3.2 in [KK] is false,
which explains their problems with the results of their computation.
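
For a cyclic cubic field the computation through Dirichlet characters needs only exact integer arithmetic. The Python sketch below is restricted, as a simplifying assumption, to a prime conductor $f \equiv 1 \pmod 3$, where the cubic character can be read off from a primitive root. It uses $L(-1, \chi) = -\frac{1}{2f}\sum_a a^2\chi(a)$ for the two nontrivial characters together with $\zeta(-1) = -1/12$, and evaluates the product as a norm from $\mathbb{Z}[\omega]$, $\omega^2 + \omega + 1 = 0$. The function names are hypothetical and this is not the PARI/GP method mentioned above.

```python
from fractions import Fraction

def primitive_root(f):
    """Smallest generator of (Z/fZ)* for a prime f (naive search)."""
    phi = f - 1
    prime_factors = [q for q in range(2, phi + 1)
                     if phi % q == 0 and all(q % r for r in range(2, int(q ** 0.5) + 1))]
    for g in range(2, f):
        if all(pow(g, phi // q, f) != 1 for q in prime_factors):
            return g
    raise ValueError("no primitive root found; is f prime?")

def zeta_cyclic_cubic_at_minus_one(f):
    """zeta_K(-1) for the cyclic cubic field of prime conductor f = 1 mod 3."""
    if f % 3 != 1:
        raise ValueError("expected a prime conductor congruent to 1 mod 3")
    g = primitive_root(f)
    s = [0, 0, 0]                       # sum of a^2 over each class chi(a) = omega^k
    a = 1
    for k in range(f - 1):
        s[k % 3] += a * a               # chi(g^k) = omega^(k mod 3)
        a = a * g % f
    # S = s0 + s1*omega + s2*omega^2 = A + B*omega, with norm A^2 - A*B + B^2
    A, B = s[0] - s[2], s[1] - s[2]
    norm = A * A - A * B + B * B
    return Fraction(-norm, 12 * (2 * f) ** 2)
```

For the conductors 7 and 13 this returns $-1/21$ and $-1/3$ respectively.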



2 A Method for Computing $\zeta_K(1 - 2n)$ for Some Totally
Real Number Fields $K$

In order to compute the exact value of the rational number $\zeta_K(1 - 2n) \in \mathbb{Q}$, we
will compute good enough numerical approximations to it and then use some
information on the size of its denominator to deduce its exact value.
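
The last step of this strategy is elementary: if a positive integer $d$ is known such that $d$ times the value is an integer, any approximation with error less than $1/(2d)$ determines the exact value by rounding. A minimal Python sketch, with a hypothetical function name:

```python
from fractions import Fraction

def exact_from_approximation(approx, d):
    """Recover a rational x with d*x in Z from approx, assuming |x - approx| < 1/(2d)."""
    return Fraction(round(d * approx), d)
```

For a totally real cubic field one may take $d = 63$ when $n = 1$, so an approximation such as `-0.0476` is enough to recover $-1/21$.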

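Theorem 1. (Siegel, see [Zag]). Let $n \ge 1$ and $m > 1$ be positive integers. Set

$$h = 2mn \quad \text{and} \quad r = \begin{cases} [h/12] & \text{if } h \equiv 2 \pmod{12}, \\ [h/12] + 1 & \text{if } h \not\equiv 2 \pmod{12}. \end{cases}$$

There exist $b_1(h), \dots, b_r(h) \in \mathbb{Q}$ such that for any totally real number field $K$ of
degree $m$ we have

$$\zeta_K(1 - 2n) = 2^m\sum_{l=1}^{r} b_l(h)S_l^K(2n) \in \mathbb{Q},$$

where

$$S_l^K(2n) = \sum_{\substack{\nu \in \mathcal{D}_{K/\mathbb{Q}}^{-1},\ \nu \gg 0 \\ \mathrm{Tr}_{K/\mathbb{Q}}(\nu) = l}}\ \sum_{\mathcal{I} \mid (\nu)\mathcal{D}_{K/\mathbb{Q}}} N_{K/\mathbb{Q}}(\mathcal{I})^{2n-1} \in \mathbb{Z}_{\ge 0}$$

($\mathcal{D}_{K/\mathbb{Q}}$ being the different of $K$ and $\mathcal{I}$ ranging over the integral ideals of $K$
dividing the integral ideal $(\nu)\mathcal{D}_{K/\mathbb{Q}}$). In particular, if $K$ is a totally real cubic
number field, then

$$-63\zeta_K(-1) = S_1^K(2) \in \mathbb{Z}_{\ge 0}.$$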

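From now on we assume that $\zeta_K(s)/\zeta(s)$ is entire, which holds true if
$K$ is either a cubic number field or a normal number field (and see Remark 1 at
the end of the paper to see how to adapt our method in the case that $\zeta_K(s)/\zeta(s)$
is not known to be entire). Our method is based on [Lou1] and [Coh, Section
10.3]. We set

$$A_{K/\mathbb{Q}} = A_K/A_{\mathbb{Q}} = \sqrt{d_K/\pi^{m-1}} \qquad (2)$$

and

$$\Lambda_{K/\mathbb{Q}}(s) := \Lambda_K(s)/\Lambda_{\mathbb{Q}}(s) = A_{K/\mathbb{Q}}^s\Gamma^{m-1}(s/2)\zeta_K(s)/\zeta(s), \qquad (3)$$

which is entire and satisfies the functional equation $\Lambda_{K/\mathbb{Q}}(1 - s) = \Lambda_{K/\mathbb{Q}}(s)$. Let

$$S_{K/\mathbb{Q}}(x) := \frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty}\Lambda_{K/\mathbb{Q}}(s)x^{-s}\,ds \qquad (c > 1 \text{ and } x > 0) \qquad (4)$$

denote the Mellin transform of $\Lambda_{K/\mathbb{Q}}(s)$. Since $\Lambda_{K/\mathbb{Q}}(s)$ is entire, it follows that
$S_{K/\mathbb{Q}}(x)$ satisfies the functional equation

$$S_{K/\mathbb{Q}}(x) = \frac{1}{x}S_{K/\mathbb{Q}}\Bigl(\frac{1}{x}\Bigr) \qquad (5)$$

(shift the vertical line of integration $\Re(s) = c > 1$ in (4) to the left to the
vertical line of integration $\Re(s) = 1 - c < 0$, then use the functional equation
$\Lambda_{K/\mathbb{Q}}(1 - s) = \Lambda_{K/\mathbb{Q}}(s)$ to come back to the vertical line of integration $\Re(s) =
c > 1$), and

$$\Lambda_{K/\mathbb{Q}}(s) = \int_0^\infty S_{K/\mathbb{Q}}(x)x^s\frac{dx}{x} = \int_1^\infty S_{K/\mathbb{Q}}(x)(x^s + x^{1-s})\frac{dx}{x} \qquad (6)$$

is the inverse Mellin transform of $S_{K/\mathbb{Q}}(x)$. Finally, we set

$$H_{m-1}(x) := \frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty}\Gamma^{m-1}(s/2)x^{-s}\,ds \qquad (c > 1 \text{ and } x > 0).$$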
Notice that $0 < H_{m-1}(x)$ for $x > 0$ (see [Lou4, Last assertion of Theorem

20] (note however the misprint there for we should have written -k <l> 2 {x) = 
<Pi{t)<p 2 {x/t)Y))- Now, write 

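$$\zeta_K(s)/\zeta(s) = \sum_{k \ge 1}\phi_kk^{-s}.$$

Then,

$$S_{K/\mathbb{Q}}(x) = \sum_{k \ge 1}\phi_kH_{m-1}(kx/A_{K/\mathbb{Q}})$$

by (3) and (4),

$$\Lambda_{K/\mathbb{Q}}(s) = \sum_{k \ge 1}\phi_k\int_1^\infty H_{m-1}(kx/A_{K/\mathbb{Q}})(x^s + x^{1-s})\frac{dx}{x}$$

by (6), and: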

Proposition 1. For m > 1 and n> 1, set 

rm,2n = P'-’"(^^)C(1 - 2n) = (-l)N— _ 2n) ' • 

For A> 0, set 

dx 1 fC+ioo , 

/„,„(^):=/ H^_,{Ax)x^— = — F^{s/2)A-^ (7) 

Ji X 2m Jc-ioo s-n 

(with c > n € Z). Then 

C-fc(l ~ 2n) = rm,2n^^/Q^ y]] </>fc|dm-l,l-2n(^^^ ) + dm-l,2n(^^^ )|- (8) 

^/Q ^/Q 

In particular^ if K is a totally real cubic field, then 

C/f(~l) = ^2 </'fc|^2,-l(^MtC/Q) + ^2,2 (^Mk/q)| G Q<0- (9) 

^ k>l 



Let $m > 1$ and $n \ge 1$ be given. Let $K$ range over a family of totally
real number fields of degree $m$ (for which $\zeta_K(s)/\zeta(s)$ is entire). Our strategy
for computing the exact values of the rational numbers $\zeta_K(1 - 2n)$ is now clear.
According to Theorem 1 there exists $d_{m,2n} \in \mathbb{Z}_{>0}$ such that $d_{m,2n}\zeta_K(1 - 2n)$ is
a rational integer. Hence, by computing a numerical approximation $\tilde{\zeta}_K(1 - 2n)$
to $\zeta_K(1 - 2n)$ such that $|d_{m,2n}\zeta_K(1 - 2n) - d_{m,2n}\tilde{\zeta}_K(1 - 2n)| < 1/2$ we can
deduce the exact value of $\zeta_K(1 - 2n)$. For that purpose, we use (8) where we
disregard the indices $k > M$ where the integer $M \ge 1$ is chosen such that
$R_{K,2n}(M) < 1/(2d_{m,2n})$ where



RK,2n{M) rm,2nA^^2 



4>k\jm-l,l-2n{k/ Ax/Ci) + Im-l,2n{k / A^ / ofi 



k>M 
According to Proposition 2,

$$M \ge A_{K/\mathbb{Q}}\bigl((2n + o(1))\log A_{K/\mathbb{Q}}\bigr)^{(m-1)/2}$$

will do (where $o(1)$ tends to zero as $d_K \to \infty$). Then, it only remains to explain
in Proposition 3 how one can efficiently compute numerical approximations, as good
as desired, to $I_{m-1,1-2n}(k/A_{K/\mathbb{Q}})$ and $I_{m-1,2n}(k/A_{K/\mathbb{Q}})$ for $1 \le k \le M$.

We give in Table 2 below an excerpt of the computation of $\zeta_{K_m}(-1)$ we did for
the non-normal totally real cubic fields $K_m$ (studied in [Lou5]) associated with
the $\mathbb{Q}$-irreducible polynomials $P_m(x) = x^3 - mx^2 - (m + 1)x - 1$ of discriminants
$D_m = (m^2 + m - 3)^2 - 32$. Here, we assume that (i) $4 \le m \not\equiv 3 \pmod 7$ is
such that $D_m$ is square-free, or (ii) $4 \le m \equiv 3 \pmod 7$ is such that $D_m/7^2$
is square-free, in which cases the set $\{1, \rho_m, \rho_m^2\}$ forms a $\mathbb{Z}$-basis of the ring of
algebraic integers of $K_m = \mathbb{Q}(\rho_m)$ (see [Lou5, Theorem 3]), where $\rho_m$ is the only
positive real root of $P_m(x)$. (The computation of the $\phi_k$ is explained in [Lou5,
Section 3], and $h_m$ denotes the class number of $K_m$).

Lemma 1. Let Im,n{A) be as in (7). It holds that 



hm,n{-^) — 



(tiH ^ ^ tmT^'^ dt\ - ■ ■ dtr, 

t\ ' ' ' tm 



(which increases as n increases) 



pOO 

< mA-”T™-i(n/2) / 



dt 

t 



Hence, setting Cm,n = m{{n — 1)!)"^, we have 






n— 1 I 



which implies 

< IrnAA) < (10) 

and (for A > 



W,i- 2 n{A) < ImMA) < ec„,„e-^'''”A-2-2(— (11) 

Proof. For the first assertion, use (7), Fubini’s theorem and (for c > n) 

j_ ^ ^ r X" if x> 1 

27TZ Jc-ioo s — n ^ \0 if0<a;<l. 

Then, notice that if ti • • • > 1 then at least one of the is > 1. For the last 

assertion, use 



1 

(n — 1)! 



-t.ndt -a; 1 

' ‘ T" Ln" 

i=0 



(for n > 1 an integer). 
Lemma 2. (See [Lou4, Lemma 26]). For $k \ge 1$ a positive integer, set

$$d_m(k) := \sum_{d_1 \cdots d_m = k} 1.$$

It holds that $|\phi_k| \le d_{m-1}(k)$ and

$$D_m(x) := \sum_{1 \le n \le x}\frac{d_m(n)}{n} \le \Bigl(\sum_{1 \le n \le x}\frac{1}{n}\Bigr)^m \le \log^m(ex) \qquad (x \ge 1). \qquad (12)$$

Set

$$R_{m,2n}(M, A) := \sum_{k > M} d_m(k)I_{m,2n}(k/A).$$

Then, 

RK.2n{M) < 2rm,2nA^j^^^Rm-l,2n{M,Aii/Cl), 
by Lemma 2, and we have: 

Proposition 2. Assume that m > 1. For n>l, M>A>1 and M > 
exp(m^/2(m — l)(n — 1)), it holds that 

Rm.2n{M,A) < 

M ^ m m ^ 

which yields 

RmMM,A) Al-\l0gA)(3W2)+n-mn ^ > A(A log 

For n = 1, ^ > 0 and M > (3m/2)'"/^, it holds that 

Rm. 2 {M,A) <m^{l+ log™(eM). 

M m 

Proof. If / G C^{[M, oo), R) and lim 3 ,_>oo Dm{x)f{x) = 0, then 

f{k) = - f Dm{x)f{x)dx. 

kftr ^ 

Hence, using (11) to obtain 

jImMj) < fi^) ■■= ec™,„e-("/^)''”'(xM)-i-2(— !)("-!)/-, 
and using (12) we obtain 

R^.2n{M,A) < H V “^/(fc) < ec^,„H2 H g{{x/Af/^)% 

^ Jm a;2 
with



g{y) = (m/2)™(^ 



1 + 



2(m-l)(n-l) 2 



m J 



and B = Now, if g{y) < g{C) for y > C, then for M > we have 

Jm Jm M 



and 



Bm,2n 

Now, g'{y) is of the same sign as 



{M,A) < ec„,„^5((MM)2/™) for M > AC^/^. 



2(w- l)(n- 1) 
m ^ m 



2 \ . im 
fo2/)(i + — 



i)(^ 

y 



1 ) 



m 

2/ logy 



)• 



Hence, for n > 1 we have g'{y) < 0 as soon as j/ > 1 and (m — l)(n — 1) — 
m/logy > 0, hence as soon as y > exp(m/((m — l)(n — 1)). For n = 1, g'{y) 

which is of the same sign as — — fl + — which decreases as v 

increases, hence is less than 0 for y > 3ml2. 



Proposition 3. For n ^ {—2k; k G Z>o} a rational integer it holds that 

W(A) = F™(n/2)H-” + VRes,=_ 2 fefs ^ F™(s/2)— ) 

fc >0 ^ 

where this series is absolutely convergent, and for any integer M > 0 with M ^ 
— (n + l)/2 we have 



y] Ress^-2k 

k>M 






^ Cm 7r’”/2^2M+l 

- ^|2M + l + n|(M!/2)'« 



where 



dt 



Cm — 



/_oo (cosh(Tri))”^/^ 
is less than or equal to C 2 = 1 for m >2. Moreover, for m = 2 we obtain 

k 

finer A -I- "T 

2fc + n 



4 1 1 

i,Aa) = rnn/2)4- + E ('»8 -^ + 7 - ^ - E y)^- d^) 

fc >0 1 = 1 ' ’ 



Proof. It holds that 

, A-s . 1 p-2M-l+ioo 

V Res,^_ 2 fc(s ^ r™(s/2) ) = — / r™(s/2) 



k>M 



s — n 
and



, 2M + 1 

n — 



cosh(Trt) 



M 

n 

k=0 



2k + 1 



1-2 



■ it] 



/ 2 ,2 

< ( ) 

— ^ l\yf ’ 



M\ cosh(Trt) 



(see [Lou2, (32)]). 



Remark 1. If one does not know beforehand that $\zeta_K(s)/\zeta(s)$ is entire, then one
can use the previous method starting with $\Lambda_K(s) := s(s - 1)A_K^s\Gamma^m(s/2)\zeta_K(s)$
which is entire and satisfies the functional equation $\Lambda_K(1 - s) = \Lambda_K(s)$.



3 Tables 



Table 1. Some real cyclic cubic fields

 $m$   $D_m$              $-21\zeta_{K_m}(-1)$    value in [KK]
  8    97                 7·367                   7·367
 13    217 = 7·31         3^?·7·13·37             3·89·113
 14    247 = 13·19        3·7·19·109              3·7·19·109
 22    559 = 13·43        3^?·7·43·61             71·6977
 29    937                2^?·3·7^?·3931          2·3^?·7·11·1667
 35    1339 = 13·103      3^?·7·111091            2^?·3·41·443
 40    1729 = 7·13·19     3^?·7·13·6073           3·5·241·4127

Table 2. Some non-normal totally real cubic fields

 $m$   $D_m$                 $h_m$   $\zeta_{K_m}(-1)$
  4    257                   1       -2/3
  5    697 = 17·41           1       -8/3
  6    1489                  1       -8
  7    2777                  2       -24
  8    4729                  1       -44
  9    7537                  2       -96
 10    11417 = 7^2·233       3       -200
 11    16609 = 17·977        2       -300
 12    23377 = 97·241        2       -1516/3
 13    32009                 3       -896
 14    42817 = 47·911        3       -1280
 15    56137 = 73·769        2       -1856
 16    72329 = 151·479       6       -3168
 17    91777 = 7^2·1873      3       -3968
 18    114889                3       -5376
 19    142097                6       -8652
 20    173857 = 23·7559      5       -10518

Abstract. Until recently, no Salem numbers were known of trace below
$-1$. In this paper we provide several examples of trace $-2$, including an
explicit infinite family. We establish that the minimal degree for a Salem
number of trace $-2$ is 20, and exhibit all Salem numbers of degree 20
and trace $-2$. Indeed there are just two examples.

We also settle the closely-related question of the minimal degree $d$ of
a totally positive algebraic integer such that its trace is $\le 2d - 2$. This
minimal degree is 10, and there are exactly three conjugate sets of degree 
10 and trace 18. Their minimal polynomials enable us to prove that 
all except five conjugate sets of totally positive algebraic integers have 
absolute trace greater than 16/9. 

We end with a speculative section where we prove that, if a single poly- 
nomial with certain properties exists, then the trace problem for totally 
positive algebraic integers can be solved. 



1 Introduction 

A Salem number is a real algebraic integer greater than 1 whose other conjugates
all lie in the closed disc $|z| \le 1$, with at least one on the circle $|z| = 1$.
Salem numbers have a long history, making their appearance in surprisingly 
diverse areas of mathematics. These include: polynomials having small Mahler 
measure, questions concerning uniform distribution, harmonic analysis, dynami- 
cal systems, growth series of Coxeter groups, pretzel knots, and special values of 
L-functions. As detailed below, the problem of finding Salem numbers that have 
negative trace is closely connected to the old problem ([9] appeared in 1918) of 
finding totally positive algebraic integers of unusually small trace (in the sense 
that the trace divided by the degree, the absolute trace, is small). Until recently, 
no examples were known of Salem numbers having trace below —1. Here we 
settle the question: what is the smallest possible degree for a Salem number of 
trace —2? 

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 327-337, 2004. 

© Springer- Verlag Berlin Heidelberg 2004 
The problem is related to that of finding totally positive algebraic integers
of degree $d$ and trace $2d - 2$. For suppose that

$$f(x) = x^d - (2d - 2)x^{d-1} + \cdots$$

is the minimal polynomial of a totally positive algebraic integer. Then we apply
the transformation $x = z + 1/z + 2$, and clear denominators, to produce a
reciprocal polynomial

$$F(z) = z^{2d} + 2z^{2d-1} + \cdots + 2z + 1$$

which is the minimal polynomial of an algebraic integer of degree $2d$ and trace
$-2$. The reverse transformation is a little more complicated: in $F(z)/z^d$, replace
each $z^j + 1/z^j$ by $T_j(x - 2)$, where $T_j$ is the $j$-th Chebyshev polynomial, defined
by $T_j(z + 1/z) = z^j + 1/z^j$. Any roots of $f(x)$ in the interval $0 < x < 4$ are
mapped to pairs of roots of $F(z)$ on the unit circle. Any roots of $f(x)$ in the
interval $x > 4$ are mapped to pairs of reciprocal real positive roots of $F(z)$. We
see that the problem of finding all Salem numbers of degree $2d$ and trace $-2$ is
equivalent to that of finding all totally positive algebraic integers $\theta$ of degree $d$
and trace $2d - 2$ such that both (i) $\theta > 4$; and (ii) all other conjugates of $\theta$ are
in the interval $0 < x < 4$.
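
The forward transformation is easy to carry out on coefficient lists, since $z^d(z + 1/z + 2)^k = z^{d-k}(z + 1)^{2k}$. The following Python sketch (the function name `to_reciprocal` is hypothetical, and `math.comb` requires Python 3.8 or later) computes $F(z) = z^df(z + 1/z + 2)$ from the coefficients of $f$, lowest degree first.

```python
from math import comb

def to_reciprocal(f):
    """Coefficients of F(z) = z^d * f(z + 1/z + 2), lowest degree first."""
    d = len(f) - 1
    F = [0] * (2 * d + 1)
    for k, c in enumerate(f):
        # c * x^k contributes c * z^(d-k) * (z+1)^(2k)
        for j in range(2 * k + 1):
            F[d - k + j] += c * comb(2 * k, j)
    return F
```

For example, `to_reciprocal([1, -3, 1])`, i.e. $f(x) = x^2 - 3x + 1$ of trace $3 = 2d - 1$, gives `[1, 1, 1, 1, 1]`, the cyclotomic polynomial $z^4 + z^3 + z^2 + z + 1$ of trace $-1$.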

The similar problem for trace $-1$ was settled some time ago: the smallest
degree for a Salem number of trace $-1$ is 8, and there is just one such Salem
number, having minimal polynomial

$$z^8 + z^7 - z^6 - 4z^5 - 5z^4 - 4z^3 - z^2 + z + 1.$$

In [14] it is shown that there are infinitely many Salem numbers of trace $-1$,
with examples of degree $2d$ for every $d \ge 4$. At that time, no examples of trace
below —1 were known. We now know (see [5]) that there are infinitely many 
Salem numbers of every trace. In this paper we give a simpler proof that there 
are infinitely many Salem numbers of trace —2, using techniques from [14]. 

Some examples of Salem numbers of trace —2 are given in the next section, 
including one of degree only 26. These examples were obtained using a graphical 
construction described in [6] (generalising that in [4]), and using an interlacing 
construction described in [5] (which is greatly generalised in [7]). Bounds ob- 
tained in [11] show that to achieve trace —2 the degree must be at least 18. 
Further computations, announced in [14], showed that to achieve trace —2 the 
degree must be at least 20. This is confirmed by the computations of Sect. 3. 
However, we need no longer rely on these computations, as a direct proof of this 
is given in Sect. 4. The gap between 20 and 26 seemed tantalisingly narrow, and 
an improved search algorithm, detailed below, was set to work on degree 20. 
Luckily for us, we did not need to go up to degree 22! There are two examples 
at degree 20, and their minimal polynomials are given in Table 1. 

The search in fact found all totally positive algebraic integers of degree 10 
and trace 18. There are just three conjugate sets of these, and their minimal 
polynomials are displayed in Table 2. The first two of these polynomials yield 
Salem numbers via the transformation $x = z + 1/z + 2$; the third does not, since
it has two roots greater than 4.

Table 1. Minimal polynomials of the Salem numbers of degree 20 and trace $-2$

$z^{20} + 2z^{19} + z^{18} - 3z^{17} - 9z^{16} - 15z^{15} - 18z^{14} - 18z^{13} - 16z^{12} - 14z^{11}$
$- 13z^{10} - 14z^9 - 16z^8 - 18z^7 - 18z^6 - 15z^5 - 9z^4 - 3z^3 + z^2 + 2z + 1$

$z^{20} + 2z^{19} - 8z^{17} - 22z^{16} - 40z^{15} - 58z^{14} - 74z^{13} - 87z^{12} - 96z^{11}$
$- 99z^{10} - 96z^9 - 87z^8 - 74z^7 - 58z^6 - 40z^5 - 22z^4 - 8z^3 + 2z + 1$

Table 2. Minimal polynomials of the totally positive algebraic integers of degree 10
and trace 18

$f_1(x) = x^{10} - 18x^9 + 135x^8 - 549x^7 + 1320x^6 - 1920x^5$
$+ 1662x^4 - 813x^3 + 206x^2 - 24x + 1$

$f_2(x) = x^{10} - 18x^9 + 134x^8 - 538x^7 + 1273x^6 - 1822x^5$
$+ 1560x^4 - 766x^3 + 200x^2 - 24x + 1$

$f_3(x) = x^{10} - 18x^9 + 134x^8 - 537x^7 + 1265x^6 - 1798x^5$
$+ 1526x^4 - 743x^3 + 194x^2 - 24x + 1$

In [11] a lower bound is given for the absolute trace of totally positive al- 
gebraic integers. Using all three of the polynomials in Table 2, we are able to 
improve this bound (Sect. 4). The paper then concludes with a speculative sec- 
tion on the trace problem for totally positive algebraic integers. 

2 Examples of Salem Numbers of Trace —2 

2.1 Examples from Graphs 

In [4], Salem numbers were constructed using star-like trees. It is known which
star-like trees have exactly one eigenvalue $\lambda > 2$, and for any such tree we define
$\tau > 1$ by $\sqrt{\tau} + 1/\sqrt{\tau} = \lambda$. Then $\tau$ is a Salem number, unless $\lambda$ is a rational
integer. It was shown in particular that for any integer $r > 2$, and any integers
$a_1, \dots, a_r$, all at least 2 (and excluding certain exceptional choices), the only
solutions to the equation







are a certain Salem number (or perhaps a reciprocal Pisot number), its conju- 
gates, and possibly some roots of unity. The corresponding star-like tree has oi, 
. . . , Qr vertices on its r arms. 

Applying the method in [3], if one takes $r = 10$, $a_1 = 390$, $a_2 = 462$, $a_3 = 1190$,
$a_4 = 1938$, $a_5 = 1995$, $a_6 = 2090$, $a_7 = 2805$, $a_8 = 4641$, $a_9 = 4862$, and
$a_{10} = 5005$, one produces a Salem number of degree 23838 and trace $-2$. It might




330 J. McKee and C. Smyth 



seem a daunting task to test a polynomial $f(z)$ of degree 23838 for irreducibility.
Luckily we need only check that no roots of unity are roots of $f(z)$, and it is
sufficient to test that

$$\gcd(f(z), f(-z)) = \gcd(f(z), f(z^2)) = \gcd(f(z), f(-z^2)) = 1 ,$$

since if $\omega$ is a root of unity, then $\omega$ is conjugate to one of $-\omega$, $\omega^2$, $-\omega^2$.
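The three-gcd criterion lends itself to a direct computational check. The sketch below is a minimal illustration in Python with exact rational arithmetic; it is not the authors' code, the function names are ours, and the test polynomial (Lehmer's degree-10 polynomial, a standard example) is not taken from this paper. The criterion is one-directional: three trivial gcds prove that no root of unity is a root, whereas a non-trivial gcd only signals that closer inspection is needed.

```python
from fractions import Fraction

def trim(p):
    """Drop zero coefficients at the high-degree end (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_rem(a, b):
    """Remainder of a divided by b over the rationals; b must be nonzero."""
    a = trim([Fraction(c) for c in a])
    b = trim([Fraction(c) for c in b])
    while len(a) >= len(b):
        q = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= q * c
        a = trim(a)
    return a

def poly_gcd_degree(a, b):
    """Degree of gcd(a, b), computed with the Euclidean algorithm."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_rem(a, b)
    return len(a) - 1

def passes_root_of_unity_test(f):
    """True if gcd(f(z), f(-z)), gcd(f(z), f(z^2)) and gcd(f(z), f(-z^2)) are all trivial."""
    neg = [c * (-1) ** k for k, c in enumerate(f)]      # f(-z)
    sq = [0] * (2 * len(f) - 1)                         # f(z^2)
    negsq = [0] * (2 * len(f) - 1)                      # f(-z^2)
    for k, c in enumerate(f):
        sq[2 * k] = c
        negsq[2 * k] = c * (-1) ** k
    return all(poly_gcd_degree(f, g) == 0 for g in (neg, sq, negsq))

# Lehmer's polynomial z^10 + z^9 - z^7 - z^6 - z^5 - z^4 - z^3 + z + 1, lowest degree first
lehmer = [1, 1, 0, -1, -1, -1, -1, -1, 0, 1, 1]
print(passes_root_of_unity_test(lehmer))
```

Only gcds of $f$ with three explicitly known polynomials are needed, so the cost stays modest even in degree 23838.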

The degree can be reduced greatly by exploiting other graphs. For example, 
by adding forks to the ends of some of the branches of a star-like tree, the Salem 
formula (1) can be generalised (again with some exclusions) to 



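$$\sum_{i=1}^{r} \frac{z^{a_i - 1} - 1}{z^{a_i} - 1} + \sum_{j=1}^{s} \frac{z^{b_j - 1} + 1}{z^{b_j} + 1} = 1 + \frac{1}{z} . \qquad (2)$$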



Taking $r = s = 3$, $a_1 = 66$, $a_2 = 130$, $a_3 = 238$, $b_1 = 255$, $b_2 = 273$, and $b_3 = 385$
in (2), one obtains a Salem number of trace $-2$ and degree 1278.

For more on producing Salem numbers from graphs, see [6]. The current
record low degree for a Salem number $\tau$ of trace $-2$ obtained from graphs is
degree 460, with $\tau$ being a root of

^69 _i (^13 + 1)(^182_^) (2ll + l)(^220 _i) ^ 

(z-1)(z195 + 1) + (2-1)(z231 + 1) “ ' 



2.2 Examples via Interlacing 



Let $Q(z)$ and $P(z)$ be relatively prime polynomials with integer coefficients, and
with all their roots on the unit circle. Suppose further that $P(z)$ is monic, $Q(z)$
has positive leading coefficient, and that the roots of $P$ and $Q$ interlace on the
unit circle. This last condition means that as you progress clockwise around the
unit circle, you encounter a zero of $P$ and a zero of $Q$ alternately. Finally we
suppose that either $P(1) = 0$, or $Q(1) = 0$ and $2P(1) - Q'(1) < 0$. Then part of
Proposition 4 of [5] states that $(z^2 - 1)P(z) - zQ(z)$ is the minimal polynomial
of a Salem number (or perhaps a reciprocal Pisot number), possibly multiplied
by a cyclotomic polynomial (i.e., a polynomial all of whose roots are roots of
unity).

In the next section, we use this interlacing construction to produce an infinite 
family of Salem numbers of trace —2. The smallest degree of any member of this 
family is 38, with the Salem number being a root of 

$$\frac{z^{5} - 1}{(z^{2} - 1)(z^{3} - 1)} + \frac{z^{12} - 1}{(z^{5} - 1)(z^{7} - 1)} + \frac{z^{24} - 1}{(z^{11} - 1)(z^{13} - 1)} = z - \frac{1}{z} .$$
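The degree and trace of this example can be checked by clearing denominators with integer arithmetic. The sketch below is an illustration under stated assumptions rather than the authors' computation: it takes the equation in the form $\sum (z^{a+b} - 1)/((z^a - 1)(z^b - 1)) = z - 1/z$ over the pairs $(a, b) = (2,3), (5,7), (11,13)$, multiplies through by the full product of the denominators, and then removes the surplus factors of $z - 1$. Function names are ours; coefficient lists are lowest degree first.

```python
def poly_mul(p, q):
    """Product of two integer polynomials (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def z_power_minus_1(n):
    """The polynomial z^n - 1."""
    return [-1] + [0] * (n - 1) + [1]

def divide_by_z_minus_1(p):
    """Exact division by z - 1 using synthetic division; requires p(1) == 0."""
    quotient, acc = [], 0
    for c in p[::-1]:                    # walk from the leading coefficient down
        acc += c
        quotient.append(acc)
    remainder = quotient.pop()
    assert remainder == 0, "polynomial is not divisible by z - 1"
    return quotient[::-1]

def interlacing_polynomial(pairs):
    """Clear denominators in sum (z^(a+b)-1)/((z^a-1)(z^b-1)) = z - 1/z,
    then strip every factor z - 1 from the result."""
    dens = [poly_mul(z_power_minus_1(a), z_power_minus_1(b)) for a, b in pairs]
    common = [1]
    for d in dens:
        common = poly_mul(common, d)
    numerator = [0] * len(common)
    for i, (a, b) in enumerate(pairs):
        term = z_power_minus_1(a + b)
        for j, d in enumerate(dens):
            if j != i:
                term = poly_mul(term, d)
        numerator = [s + t for s, t in zip(numerator, term)]
    left = poly_mul(z_power_minus_1(2), common)      # (z^2 - 1) * common denominator
    right = [0] + numerator + [0]                    # z * numerator, padded to equal length
    poly = [l - r for l, r in zip(left, right)]
    while sum(poly) == 0:                            # poly(1) == 0: a factor z - 1 remains
        poly = divide_by_z_minus_1(poly)
    return poly

p = interlacing_polynomial([(2, 3), (5, 7), (11, 13)])
print(len(p) - 1, p[-1], p[-2])
```

For these three pairs the result is a monic reciprocal polynomial of degree 38 whose coefficient of $z^{37}$ is 2, in agreement with the stated degree and trace $-2$. The sketch does not test irreducibility.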




Salem Numbers of Trace —2 



331 



The current record via interlacing is of degree only 26. Define polynomials

$P(z) = z^{24} + 4z^{23} + 9z^{22} + 15z^{21} + 21z^{20} + 26z^{19} + 29z^{18} + 29z^{17} + 26z^{16}$
$\quad + 21z^{15} + 15z^{14} + 8z^{13} - 8z^{11} - 15z^{10} - 21z^{9} - 26z^{8} - 29z^{7} - 29z^{6}$
$\quad - 26z^{5} - 21z^{4} - 15z^{3} - 9z^{2} - 4z - 1 ,$

$Q(z) = 2z^{24} + 7z^{23} + 14z^{22} + 21z^{21} + 27z^{20} + 31z^{19} + 33z^{18} + 33z^{17} + 32z^{16}$
$\quad + 31z^{15} + 31z^{14} + 31z^{13} + 31z^{12} + 31z^{11} + 31z^{10} + 31z^{9} + 32z^{8} + 33z^{7}$
$\quad + 33z^{6} + 31z^{5} + 27z^{4} + 21z^{3} + 14z^{2} + 7z + 2 .$

Then $P(z)$ and $Q(z)$ satisfy all the required conditions, and the polynomial
$(z^2 - 1)P(z) - zQ(z)$ (which has trace $-2$) is in fact irreducible, and is the
minimal polynomial of a Salem number of degree 26. For an explanation of the
construction of this remarkable pair of polynomials, see [7].

2.3 Infinitely Many Examples 

For any integer $p$ that is coprime to $2 \cdot 3 \cdot 5 \cdot 7 \cdot 11$, clearing denominators in the
equation

$$\frac{z^{5} - 1}{(z^{2} - 1)(z^{3} - 1)} + \frac{z^{12} - 1}{(z^{5} - 1)(z^{7} - 1)} + \frac{z^{11+p} - 1}{(z^{11} - 1)(z^{p} - 1)} = z - \frac{1}{z} \qquad (3)$$

gives a polynomial of trace $-2$. From Proposition 4 of [5], this is the minimal
polynomial of a Salem number, possibly multiplied by a cyclotomic polynomial.
We now show that in fact this polynomial is irreducible for all $p > 11$ coprime
to $2 \cdot 3 \cdot 5 \cdot 7 \cdot 11$, giving infinitely many examples of Salem numbers of trace $-2$.
All that is required is to show that no root of unity satisfies (3).

Putting $y = z^p$, (3) reads $h(z, y) = 0$, where

$$h(z, y) = \frac{z^{5} - 1}{(z^{2} - 1)(z^{3} - 1)} + \frac{z^{12} - 1}{(z^{5} - 1)(z^{7} - 1)} + \frac{y z^{11} - 1}{(z^{11} - 1)(y - 1)} - z + \frac{1}{z} .$$

We again apply the trick that if a root of unity $\omega$ satisfies (3), then so does
one of $-\omega$, $\omega^2$, $-\omega^2$. (See [1] for more applications of this idea.) Our restriction
on $p$ implies in particular that $p$ is odd, and hence $(-z)^p = -y$.

Eliminating $z$ between $h(z, y) = 0$ and $h(-z, -y) = 0$ (by computing the
resultant of the numerators of $h(z, y)$ and $h(-z, -y)$, treating $z$ as the variable)
yields

$$(y^2 + 1) f(y) = 0 ,$$

where $f(y)$ has no cyclotomic factors. Eliminating $y$ instead yields

$$(z^4 - z^2 + 1) g(z) = 0 ,$$

where $g(z)$ has no cyclotomic factors. If both $z = \omega$ and $z = -\omega$ were to satisfy
(3), then we would need $y = \omega^p$ to be a primitive fourth root of unity, with $\omega$ a
primitive twelfth root of unity. It would follow that $p$ is divisible by 3.

Similarly, eliminating first $y$ and then $z$ between $h(z, y) = 0$ and $h(z^2, y^2) = 0$
shows that if both $z = \omega$ and $z = \omega^2$ satisfy (3), then $\omega$ is a primitive
second, third, fifth, seventh, or eleventh root of unity, and $\omega^p = 1$. It would
follow that $p$ is divisible by at least one of 2, 3, 5, 7, or 11.

Finally, considering $h(z, y) = 0$ and $h(-z^2, -y^2) = 0$ simultaneously shows
that if both $z = \omega$ and $z = -\omega^2$ satisfy (3), then $p$ is divisible by 3.
We see that no roots of unity can satisfy (3) provided that $p$ is prime to
$2 \cdot 3 \cdot 5 \cdot 7 \cdot 11$. There are infinitely many such $p$, giving infinitely many Salem
numbers of trace $-2$.

3 Totally Positive Algebraic Integers of Given Degree 
and Trace 

3.1 The Old Search Algorithm 

For ease of exposition, we consider the search for totally positive algebraic in- 
tegers of degree 10 and trace 18. The algorithm clearly generalises to arbitrary 
degree and trace. 

We seek all positive integers $a_2, \ldots, a_{10}$ such that

$$f(x) = x^{10} - 18x^{9} + a_2 x^{8} - a_3 x^{7} + a_4 x^{6} - a_5 x^{5} + a_6 x^{4} - a_7 x^{3} + a_8 x^{2} - a_9 x + a_{10}$$

has 10 distinct positive real roots. It is extremely difficult for a totally positive
algebraic integer of degree 10 to have trace as small as 18, so having found all
suitable $a_i$, nearly all of the corresponding polynomials $f(x)$ will be reducible.

For $1 \le i \le 10$, we let $f_i(x)$ be the $(10 - i)$th derivative of $f(x)$. If $f(x)$ has 10
distinct positive roots, then for each $i$, $f_i(x)$ will have $i$ distinct positive roots.
We find all possibilities for $f(x)$ by building up from below: we list all values of
$a_2$ such that $f_2(x)$ has 2 distinct positive roots; for each suitable $a_2$, we list all
values of $a_3$ such that $f_3(x)$ has 3 distinct positive roots; and so on. Many of our
candidates for the higher derivatives of $f(x)$ do not survive this lifting process:
we frequently find that for some $f_i(x)$ having $i$ positive real roots there is no
choice of $a_{i+1}$ that makes $f_{i+1}(x)$ have $i + 1$ positive roots.

Given a candidate for $f_i(x)$, having $i$ distinct positive roots, the technique of
Robinson [8] (used also in [11]) to find all suitable values of $a_{i+1}$ was to observe
that the real values of $a_{i+1}$ such that $f_{i+1}(x)$ has $i + 1$ distinct positive roots
form an interval (possibly empty), with endpoints determined by considering the
values of $f_{i+1}(x)$ at its local maxima and minima. Although much more efficient
than a naive brute-force search, this method requires the computation of the
roots of a huge number of polynomials, using floating-point arithmetic.



3.2 The New Search Algorithm 

Our new algorithm still builds up f{x) from its derivatives, as in the previous 
section. But the endpoints of the interval for $a_i$ are determined in a different







manner, removing the need for floating-point arithmetic, and hugely speeding 
the search. 

With notation as in the previous section, we suppose that we have a candidate
for $f_i(x)$ having $i$ distinct positive roots. We wish to identify the (possibly empty)
range of values for $a_{i+1}$ such that $f_{i+1}(x)$ has $i + 1$ distinct positive roots. For
ease of notation, we put $a = a_{i+1}$. We observe that $D(a)$, the discriminant of
$f_{i+1}(x)$ (which is a polynomial in $a$, given that all higher coefficients have been
selected), vanishes at the (real) endpoints of the desired interval for $a$. In fact
all its roots are real: they are the values of $a$ for which $f_{i+1}(\beta) = 0$, where $\beta$ is a root of $f_i$.

Indeed the required interval is marked by the middle two roots of $D(a)$ (with
the interpretation that if $D$ has odd degree, then we take the middle zero and
the one to the left of it). For when $a = a_{i+1}$ is large and negative, $f_{i+1}$ has either
one or two real roots (depending on the parity of $i$), and as $a$ is increased the
number of real roots of $f_{i+1}$ jumps by two as we pass each root of $D(a)$. When
$a$ is large and positive, the number of real roots is either one or none. The only
possible interval in which the number of real roots can be as large as $i + 1$ is
that bounded by the middle two roots. For these roots of $f_{i+1}$ all to be positive
we require also that $a > 0$.

It might appear that the problem of root-finding has simply been transferred 
to a different polynomial, but note that we only need to find the roots of D(a) 
to the nearest integer. To this end, we take some crude initial approximations to 
the middle two roots, then refine these using Sturm sequences to pin the roots 
down to the nearest integer. (The initial approximation that we used was simply 
to try the endpoints of the previous interval.) Then a further Sturm sequence
computation for a single value of $a_{i+1}$ in the interval will reveal whether or not
$f_{i+1}$ has the full $i + 1$ real roots for all $a_{i+1}$ in the interval. We also require
$a_{i+1} > 0$, to ensure that all these roots are positive.
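A Sturm sequence counts real roots exactly, without any floating-point root-finding. The following is a minimal sketch in Python of that counting step only; it is not the authors' PARI/GP code, the function names are ours, and it assumes the input polynomial is squarefree with nonzero constant term. Coefficient lists are lowest degree first.

```python
from fractions import Fraction

def trim(p):
    """Drop zero coefficients at the high-degree end."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_rem(a, b):
    """Remainder of a divided by b over the rationals; b must be nonzero."""
    a = trim([Fraction(c) for c in a])
    b = trim([Fraction(c) for c in b])
    while len(a) >= len(b):
        q = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= q * c
        a = trim(a)
    return a

def derivative(p):
    return [k * c for k, c in enumerate(p)][1:]

def sturm_chain(p):
    """p, p', then the negated remainders of successive divisions (degree of p at least 1)."""
    chain = [trim(p), trim(derivative(p))]
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not r:
            break
        chain.append([-c for c in r])
    return chain

def sign_changes(values):
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def count_positive_roots(p):
    """Number of distinct real roots of p in (0, +infinity), by Sturm's theorem."""
    chain = sturm_chain(p)
    at_zero = [q[0] for q in chain]        # the value at 0 is the constant term
    at_infinity = [q[-1] for q in chain]   # the sign at +infinity is that of the leading coefficient
    return sign_changes(at_zero) - sign_changes(at_infinity)

# x^3 - 5x^2 + 6x - 1 and its derivative 3x^2 - 10x + 6
cubic = [-1, 6, -5, 1]
print(count_positive_roots(cubic), count_positive_roots(derivative(cubic)))
```

The cubic $x^3 - 5x^2 + 6x - 1$ used here is one of the polynomials appearing in inequality (4); it has three positive roots and its derivative has two, which is the pattern the lifting process relies on.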

A further improvement is to use non-trivial lower bounds for the $a_i$, based
on known lower bounds for the traces of totally positive algebraic integers given
in [11]; see also Sect. 4. This prunes out many hopeless $f_i(x)$ with $i$ small.

The full search took 150 hours on a 1.2GHz PC, using PARI/GP, and pro- 
duced three irreducible polynomials of degree 10, trace 18, with 10 distinct posi- 
tive roots, as listed in Table 2. Some 4065 reducible polynomials of degree 10 and 
trace 18 were found. Studying the irreducible factors of this output provides the 
necessary information to find all Salem numbers of trace $-1$ and degree $\le 18$,
and also confirms that to achieve degree $d$ and trace $\le 2d - 2$ requires $d \ge 10$
(this also follows from the result of the next section).

4 Improving the Lower Bound for the Absolute Trace of 
Totally Positive Algebraic Integers 

In [12] it was shown that all except five conjugate sets of totally positive alge-
braic integers $\alpha$ have absolute (also called mean) trace $\mathrm{tr}(\alpha)/\deg(\alpha) > 1.7719$
($\mathrm{tr}(\alpha)$ and $\deg(\alpha)$ being the trace and degree respectively). We can now use
the three newly discovered polynomials of degree 10 to improve this bound to
$1.778378 > 16/9$. The proof employs the same method as [12]: semi-infinite linear
programming is used to produce the following inequality, valid for all $x > 0$

$$\begin{aligned}
x &- 0.5455833645 \log|x| - 0.4958676072 \log|x - 1| - 0.05892353929 \log|x - 2| \\
&- 0.1846627119 \log|x^2 - 3x + 1| - 0.002613011520 \log|x^2 - 4x + 1| \\
&- 0.008163503307 \log|x^2 - 4x + 2| - 0.09063100904 \log|x^3 - 5x^2 + 6x - 1| \\
&- 0.01899914258 \log|x^3 - 6x^2 + 9x - 1| - 0.008696349375 \log|x^3 - 6x^2 + 9x - 3| \\
&- 0.05794447530 \log|x^4 - 7x^3 + 13x^2 - 7x + 1| \\
&- 0.03510719518 \log|x^4 - 7x^3 + 14x^2 - 8x + 1| \\
&- 0.008492128216 \log|x^5 - 9x^4 + 28x^3 - 35x^2 + 15x - 1| \\
&- 0.01082775244 \log|x^5 - 9x^4 + 27x^3 - 31x^2 + 12x - 1| \\
&- 0.0008908117930 \log|x^6 - 11x^5 + 43x^4 - 72x^3 + 51x^2 - 14x + 1| \\
&- 0.005949580568 \log|x^7 - 13x^6 + 63x^5 - 143x^4 + 158x^3 - 80x^2 + \cdots| \\
&- 0.008478368652 \log|f_1(x)| - 0.007206449910 \log|f_2(x)| \\
&- 0.01019001634 \log|f_3(x)| > 1.7783786 , \qquad (4)
\end{aligned}$$

where $f_1(x)$, $f_2(x)$, $f_3(x)$ are the three degree 10 polynomials displayed in Table
2. To prove the existence of the lower bound $\mathrm{tr}(\alpha)/\deg(\alpha) > 1.7783786$ for a
totally positive nonexceptional $\alpha$, we substitute for $x$ each conjugate $\alpha_j$ of $\alpha$, and
average. Then if the minimal polynomial of $\alpha$ does not appear in the inequality,
we get that $\mathrm{tr}(\alpha)/\deg(\alpha) > 1.7783786 + \sum_k c_k \log|R_k|$, where the $c_k$ are positive,
and the $R_k$ are nonzero integer resultants. Hence $\mathrm{tr}(\alpha)/\deg(\alpha) > 1.7783786$, as
claimed.
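The averaging step can be spelled out as follows. This is only a sketch in notation of our own: $d = \deg(\alpha)$, $m_\alpha$ is the minimal polynomial of $\alpha$, and $P_k$ denotes the $k$-th monic integer polynomial occurring in (4), with positive coefficient $c_k$.

```latex
\frac{1}{d}\sum_{j=1}^{d}\alpha_j \;-\; \sum_k \frac{c_k}{d}\sum_{j=1}^{d}\log\bigl|P_k(\alpha_j)\bigr| \;>\; 1.7783786 ,
\qquad
\sum_{j=1}^{d}\log\bigl|P_k(\alpha_j)\bigr|
  \;=\; \log\Bigl|\prod_{j=1}^{d} P_k(\alpha_j)\Bigr|
  \;=\; \log\bigl|\mathrm{Res}(m_\alpha, P_k)\bigr| \;\ge\; 0 .
```

Since $m_\alpha$ is irreducible and differs from every $P_k$, each resultant is a nonzero integer, so every logarithm on the right is non-negative. Moving these terms to the right-hand side leaves $\mathrm{tr}(\alpha)/d > 1.7783786$.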

The exceptional $\alpha$ are those of absolute trace less than 1.7783786 whose min-
imal polynomial does appear in the above inequality, namely $\alpha$ having minimal
polynomial $x - 1$, $x^2 - 3x + 1$, $x^3 - 5x^2 + 6x - 1$, $x^4 - 7x^3 + 13x^2 - 7x + 1$ or
$x^4 - 7x^3 + 14x^2 - 8x + 1$.

Note that, if $d = \deg(\alpha)$ and $\mathrm{tr}(\alpha) \le 2d - 2$ then, as this inequality excludes
the five exceptional polynomials, we must have $16d/9 < \mathrm{tr}(\alpha) \le 2d - 2$, so that
$d \ge 10$. This confirms again the computation at the end of the previous section,
and checks too that there are no totally positive algebraic integers of degree 9
and trace 16.
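The arithmetic behind the degree bound is short enough to write out:

```latex
\frac{16d}{9} < 2d - 2
\;\iff\; 16d < 18d - 18
\;\iff\; d > 9 .
```

So the smallest admissible degree is $d = 10$. For $d = 9$ and trace 16 the absolute trace would be exactly $16/9 = 1.777\ldots$, which lies below the bound 1.7783786.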



5 A Polynomial That Would Solve the Trace Problem 

5.1 Background 

The trace problem for totally positive algebraic integers (called the “Schur- 
Siegel-Smyth trace problem” by Peter Borwein in his very nice recent book 
[2]), is the following. 

Problem 1. Fix $\rho < 2$. Then show that all but finitely many totally positive
algebraic integers $\beta$ have $\mathrm{tr}(\beta)/\deg(\beta) > \rho$.

Thus here $\beta$ is a zero of an irreducible monic polynomial of degree $\deg(\beta)$
with integer coefficients, whose roots are all positive, and whose sum is $\mathrm{tr}(\beta)$.

In 1918 I. Schur [9] solved the problem for $\rho < \sqrt{e} = 1.6487\cdots$. In 1943 C.L.
Siegel [10] solved it for $\rho < 1.737$. In [11] (see also [12]) the problem was solved
for $\rho < 1.7719$, while in the previous section we solve it for $\rho < 1.7783786$. In
[13] it was shown that there was no inequality of the type (4) having a lower
bound $\rho$ for any $\rho$ larger than $2 - 10^{-41}$. Shortly afterwards J.-P. Serre (personal
communication, see “Note added in proof” in [13]), showed that there was no
such inequality for any $\rho$ larger than $1.8983021\cdots$. Here we present possible
further evidence against this problem being solvable for all $\rho < 2$. We prove that
the existence of a single polynomial $f$ with properties given below would imply
that the problem cannot be solved for $\rho$ sufficiently close to 2. The result is,
however, highly speculative, as such a polynomial may not exist!



5.2 The Polynomial 

Suppose that $f$ is a monic polynomial of degree at least 2 with integer coefficients
and all positive distinct roots such that

• between every pair of distinct roots of $f$ there is an $x$ with $|f(x)| > 2$.

Then we claim that the set of all totally positive algebraic integers contains
infinitely many $\beta$ whose absolute trace is no greater than $\mathrm{tr}(f)/\deg(f)$.

We now prove the claim. Let $p > 2$ be prime, $\omega_p$ a primitive $p$-th root of unity,
and $\alpha = \omega_p + 1/\omega_p$, with conjugates $\alpha_i$, and let $Q$ be the minimal polynomial of
$\alpha$. Then

$$F(x) = \prod_i \bigl(f(x) - \alpha_i\bigr)$$

is a polynomial of trace $\mathrm{tr}(f)\deg(Q)$ and degree $\deg(f)\deg(Q)$, and so of abso-
lute trace $\mathrm{tr}(F)/\deg(F) = \mathrm{tr}(f)/\deg(f)$. Note that, as the $\alpha_i$ are in $(-2, 2)$,
it is clear from the graph of $f$ that all the roots of $F$ are real, positive and
distinct. Let $\beta$ be any one of them. Then $f(\beta) = \alpha_i$ for some $i$, so that the field
$\mathbb{Q}(\beta)$ contains $\alpha_i$, and hence $\deg(\beta) \ge \frac{1}{2}(p - 1)$. Let $m(\beta)$ be the absolute trace
of $\beta$. Then, taking the $\beta_j$ to be the representatives of the conjugate sets of the
roots of $F$, we have



$$\mathrm{tr}(F) = \sum_j \mathrm{tr}(\beta_j) = \sum_j m(\beta_j) \deg(\beta_j)$$

and so

$$\mathrm{tr}(F)/\deg(F) = \mathrm{tr}(f)/\deg(f) = \sum_j m(\beta_j) w_j$$

where the weights $w_j := \deg(\beta_j)/\deg(F)$ sum to 1. Hence at least one of the
$m(\beta_j)$, $m(\beta^{(p)})$ say, is at most $\mathrm{tr}(f)/\deg(f)$.

Now let $p \to \infty$ through a sequence of primes. Then $\deg(\beta^{(p)}) \to \infty$, so that
there must be infinitely many different $\beta^{(p)}$. Thus the $\beta^{(p)}$ give the required
algebraic integers. This proves the claim.

Now we see that if there is an $f$ as above with also $\mathrm{tr}(f)/\deg(f) < 2$, then
the trace problem has no solution when $\rho > \mathrm{tr}(f)/\deg(f)$.
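The degree and trace bookkeeping for $F$ in the proof above can be checked on a small case, since $F(x) = Q(f(x))$. The sketch below is only an illustration: it takes $p = 7$, for which $Q(y) = y^3 + y^2 - 2y - 1$ is the minimal polynomial of $2\cos(2\pi/7)$, and a placeholder $f(x) = x^2 - 3x + 1$ of our own choosing. This $f$ does not satisfy the condition $|f(x)| > 2$ between its roots; it serves only to show that the trace and degree multiply as stated. Coefficient lists are lowest degree first and function names are ours.

```python
def poly_mul(p, q):
    """Product of two integer polynomials (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def compose(outer, inner):
    """outer(inner(x)) by Horner's rule."""
    result = [outer[-1]]
    for c in reversed(outer[:-1]):
        result = poly_mul(result, inner)
        result[0] += c
    return result

def trace(p):
    """Sum of the roots of a monic polynomial: minus the second-highest coefficient."""
    return -p[-2]

Q = [-1, -2, 1, 1]      # y^3 + y^2 - 2y - 1, minimal polynomial of 2cos(2*pi/7)
f = [1, -3, 1]          # x^2 - 3x + 1 (placeholder, hypothetical choice)
F = compose(Q, f)
print(len(F) - 1, trace(F))
```

Here $\deg F = 6 = \deg(f)\deg(Q)$ and $\mathrm{tr}(F) = 9 = \mathrm{tr}(f)\deg(Q)$, so the absolute trace of $F$ equals that of $f$, namely $3/2$.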

We do not know whether such a polynomial $f$ exists. Any such $f$ clearly must
have absolute trace $\mathrm{tr}(f)/\deg(f) > 1.7783786$. Indeed, its absolute trace must
be larger than any lower bound of any inequality (which might be found in the
future) of the type (4). For a monic integral polynomial with all positive roots
and absolute trace less than 2, the crucial quantity is

$$M_f := \min\bigl(|f(0)|, |f(\gamma_1)|, |f(\gamma_2)|, \ldots, |f(\gamma_{n-1})|\bigr) ,$$

where $n$ is the degree of $f$, and $\gamma_1, \gamma_2, \ldots, \gamma_{n-1}$ are the roots of $f'$. We need
$M_f > 2$.

There are many variants on this idea. For example, if $f$ is a monic polynomial
of degree at least 3 with integer coefficients such that between every pair of
distinct roots of $f$ there is an $x$ with $|f(x)/x| > 2$, then again one can show that
the set of all totally positive algebraic integers contains infinitely many $\beta$ whose
absolute trace is no greater than $\mathrm{tr}(f)/\deg(f)$. The argument is entirely similar,
but using the polynomial $F(x) = \prod_i \bigl(f(x) - \alpha_i x\bigr)$.

Abstract. Most of the interesting algorithmic problems in the geometry
of numbers are NP-hard as the lattice dimension increases. This article 
deals with the low-dimensional case. We study a greedy lattice basis reduction
algorithm for the Euclidean norm, which is arguably the most
natural lattice basis reduction algorithm, because it is a straightforward 
generalization of the well-known two-dimensional Gaussian algorithm. 
Our results are two-fold. From a mathematical point of view, we show 
that up to dimension four, the output of the greedy algorithm is optimal; 
the output basis reaches all the successive minima of the lattice. However, 
as soon as the lattice dimension is strictly higher than four, the output 
basis may not even reach the first minimum. More importantly, from a 
computational point of view, we show that up to dimension four, the 
bit-complexity of the greedy algorithm is quadratic without fast integer 
arithmetic: this makes it possible to solve various lattice problems (e.g. computing
a Minkowski-reduced basis and a closest vector) in quadratic time,
without fast integer arithmetic, up to dimension four, while all other 
algorithms known for such problems have a bit-complexity which is at 
least cubic. This was already proved by Semaev up to dimension three 
using rather technical means, but it was previously unknown whether or 
not the algorithm was still polynomial in dimension four. Our analysis, 
based on geometric properties of low-dimensional lattices and in particular
Voronoï cells, arguably simplifies Semaev’s analysis in dimensions
two and three, unifies the cases of dimensions two, three and four, but 
breaks down in dimension five. 



1 Introduction 

A lattice is a discrete subgroup of $\mathbb{R}^n$. Any lattice $L$ has a lattice basis, i.e. a set
$\{b_1, \ldots, b_d\}$ of linearly independent vectors such that the lattice is the set of all
integer linear combinations of the $b_i$'s: $L[b_1, \ldots, b_d] = \{\sum_{i=1}^{d} x_i b_i : x_i \in \mathbb{Z}\}$. A
lattice basis is usually not unique, but all the bases have the same number of
elements, called the dimension or rank of the lattice. In dimension higher than

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 338-357, 2004.
one, there are infinitely many bases, but some are more interesting than others:
they are called reduced. Roughly speaking, a reduced basis is a basis made of rea- 
sonably short vectors which are almost orthogonal. Finding good reduced bases 
has proved invaluable in many fields of computer science and mathematics, par- 
ticularly in cryptology (see for instance the survey [18]); and the computational 
complexity of lattice problems has attracted considerable attention in the past 
few years (see for instance the book [16]), following Ajtai’s discovery [1] of a 
connection between the worst-case and average-case hardness of certain lattice 
problems. 

There exist many different notions of reduction, such as those of Hermite [8],
Minkowski [17], Korkine-Zolotarev (KZ) [9,11], Venkov [19], Lenstra-Lenstra-
Lovász [13], etc. Among these, the most intuitive one is perhaps Minkowski’s
reduction; and up to dimension four, it is arguably optimal compared to all other 
known reductions, because it reaches all the so-called successive minima. How- 
ever, finding a Minkowski-reduced basis or a KZ-reduced basis is NP-hard under 
randomized reductions as the dimension increases, because such bases contain a 
shortest lattice vector, and the shortest vector problem is NP-hard under ran- 
domized reductions [2,15]. In order to better understand lattice reduction, it is 
tempting to study the low-dimensional case. Improvements in low-dimensional 
lattice reduction may lead to significant running-time improvements in high- 
dimensional lattice reduction, as the best lattice reduction algorithms known in 
theory and in practice for high-dimensional lattices, namely Schnorr’s blockwise 
KZ-reduction [20] and its heuristic variants [21,22], are based on a repeated use 
of low-dimensional KZ-reduction. 

The classical Gaussian algorithm [5] computes in quadratic time (without 
fast integer arithmetic [23]) a Minkowski-reduced basis of any two-dimensional 
lattice. This algorithm was extended to dimension three by Vallée [27] in 1986
and Semaev [24] in 2001: Semaev’s algorithm is quadratic without fast integer
arithmetic, whereas Vallée’s algorithm has cubic complexity. More generally, Helfrich
[7] showed in 1986 by means of the LLL algorithm [13] how to compute in
cubic time a Minkowski-reduced basis of any lattice of fixed (arbitrary) dimen- 
sion, but the hidden complexity constant grows very fast with the dimension. 

In this paper, we generalize the Gaussian algorithm to arbitrary dimension. 
Although the obtained greedy algorithm is arguably the simplest lattice basis 
reduction algorithm known, its analysis becomes remarkably more and more 
complex as the dimension increases. Semaev [24] was the first to prove that the 
algorithm was still polynomial-time in dimension three, but the polynomial-time 
complexity remained open for higher dimension. We show that up to dimension 
four, the greedy algorithm computes a Minkowski-reduced basis in quadratic 
time without fast arithmetic. This implies that a shortest vector and a KZ- 
reduced basis can be computed in quadratic time up to dimension four. Inde- 
pendently of the running time improvement, we hope our analysis will help to 
design new lattice reduction algorithms. The main novelty of our approach com- 
pared to previous work is that we use geometrical properties of low-dimensional 
lattices. In dimension two, the method is very close to the argument given by
Semaev in [24], which is itself very different from previous analyses of the Gaussian
algorithm [12,28,10]. In dimension three, Semaev’s analysis [24] is based on
a rather exhaustive analysis of all the possible behaviors of the algorithm, which 
involves quite a few computations and makes it difficult to extend to higher di- 
mension. We replace the main technical arguments by geometrical considerations 
on two-dimensional lattices. This makes it possible to extend the analysis to di- 
mension four, by carefully studying geometrical properties of three-dimensional 
lattices, although a few additional difficulties appear. However, it is still un- 
known whether or not the greedy algorithm remains polynomial-time beyond 
dimension four. Besides, we show that the output basis may not even reach the 
shortest vector beyond dimension four. 

The paper is organized as follows. In Section 2, we recall useful facts about 
lattices. In Section 3, we recall Gauss’ algorithm and describe its natural 
greedy generalization. In Section 4, we analyze Gauss’ algorithm and extend the 
analysis to dimensions three and four, using geometrical results. We explain why 
our analysis breaks down in dimension five. In Section 5, we prove geometrical 
results on low-dimensional lattices which are useful to prove the so-called gap 
lemmata, an essential ingredient of the complexity analysis of Section 4. 

Important remark: Due to lack of space, this extended abstract contains few 
proofs, and we only show that the algorithm is polynomial-time. The proof of 
the quadratic complexity will appear in the full version of the paper. 
Notations: $\|\cdot\|$ and $\langle \cdot, \cdot \rangle$ denote respectively the Euclidean norm and inner
product of $\mathbb{R}^n$; variables in bold are vectors; whenever the notation $[b_1, \ldots, b_d]$ is
used, we have $\|b_1\| \le \ldots \le \|b_d\|$ and in that case we say that the $b_i$'s are ordered.
Besides, the complexity model we use is the RAM model, and the computational
cost is measured in elementary operations on bits. In any complexity statement,
we assume that the underlying lattice $L$ is integral ($L \subseteq \mathbb{Z}^n$). If $x \in \mathbb{R}$, then $\lfloor x \rceil$
denotes a nearest integer to $x$.

2 Preliminaries 

We assume the reader is familiar with geometry of numbers (see [4,6,14,25]).
Gram-Schmidt orthogonalization. Let $b_1, \ldots, b_d$ be vectors. The Gram
matrix $G(b_1, \ldots, b_d)$ of $b_1, \ldots, b_d$ is the $d \times d$ matrix $(\langle b_i, b_j \rangle)_{1 \le i, j \le d}$ formed
by all the inner products. $b_1, \ldots, b_d$ are linearly independent if and only if the
determinant of $G(b_1, \ldots, b_d)$ is not zero. The volume $\mathrm{vol}(L)$ of a lattice $L$ is
the square root of the determinant of the Gram matrix of any basis of $L$. The
orthogonality-defect of a basis $(b_1, \ldots, b_d)$ of $L$ is defined as $(\prod_{i=1}^{d} \|b_i\|)/\mathrm{vol}(L)$:
it is always at least 1, with equality if and only if the basis is orthogonal.
Let $(b_1, \ldots, b_d)$ be linearly independent vectors. The Gram-Schmidt orthogonal-
ization $(b_1^*, \ldots, b_d^*)$ is defined as follows: $b_i^*$ is the component of $b_i$ orthogonal to
the subspace spanned by $b_1, \ldots, b_{i-1}$.
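The Gram matrix, the volume and the orthogonality-defect are all computable exactly from the integer coordinates of a basis. The sketch below is a small illustration in Python with function names of our own; to stay within exact arithmetic it returns the square of the orthogonality-defect, that is $\prod_i \|b_i\|^2 / \det G(b_1, \ldots, b_d)$.

```python
from fractions import Fraction

def gram_matrix(basis):
    """Matrix of all inner products <b_i, b_j>."""
    return [[sum(x * y for x, y in zip(b, c)) for c in basis] for b in basis]

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, det = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def squared_orthogonality_defect(basis):
    """(prod ||b_i||)^2 / vol(L)^2, using vol(L)^2 = det of the Gram matrix."""
    g = gram_matrix(basis)
    product = 1
    for i in range(len(basis)):
        product *= g[i][i]
    return Fraction(product) / determinant(g)

print(squared_orthogonality_defect([(2, 0), (0, 3)]))   # orthogonal basis
print(squared_orthogonality_defect([(1, 0), (1, 1)]))   # skewed basis of the same kind of lattice
```

The orthogonal basis gives the value 1, while the skewed basis $(1,0), (1,1)$ of $\mathbb{Z}^2$ gives 2, that is a defect of $\sqrt{2}$.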

Successive minima and Minkowski reduction. Let $L$ be a $d$-dimensional
lattice in $\mathbb{R}^n$. For $1 \le i \le d$, the $i$-th minimum $\lambda_i(L)$ is the radius of the smallest
closed ball centered at the origin containing at least $i$ linearly independent lattice
vectors. The most famous lattice problem is the shortest vector problem (SVP):
given a basis of a lattice $L$, find a lattice vector of norm $\lambda_1(L)$. There always
exist linearly independent lattice vectors $v_i$'s such that $\|v_i\| = \lambda_i(L)$ for all $i$.
Surprisingly, as soon as $d \ge 4$, such vectors do not necessarily form a lattice
basis, and when $d \ge 5$, there may not even exist a lattice basis reaching all
the minima. A basis $[b_1, \ldots, b_d]$ of $L$ is Minkowski-reduced if for all $1 \le i \le d$,
$b_i$ has minimal norm among all lattice vectors $b_i$ such that $[b_1, \ldots, b_i]$ can be
extended to a basis of $L$. Surprisingly, up to dimension six, one can easily decide
if a given basis is Minkowski-reduced or not, by checking a small number of
explicit norm inequalities, known as Minkowski’s conditions. A basis reaching
all the minima must be Minkowski-reduced, but a Minkowski-reduced basis may
not reach all the minima, except the first four ones (see [29]): if $[b_1, \ldots, b_d]$ is a
Minkowski-reduced basis of $L$, then for all $1 \le i \le \min(d, 4)$, $\|b_i\| = \lambda_i(L)$. Thus,
a Minkowski-reduced basis is optimal in a natural sense up to dimension four.
A classical result (see [29]) states that the orthogonality-defect of a Minkowski-
reduced basis can be upper-bounded by a constant which only depends on the
lattice dimension.

Voronoï cell and Voronoï vectors. The Voronoï cell [30] of $L = L[b_1, \ldots, b_d]$,
denoted by $\mathrm{Vor}(b_1, \ldots, b_d)$, is the set of vectors $x$ in the linear span of $L$ which
are closer to 0 than to any other lattice vector: for all $v \in L$, $\|x - v\| \ge \|x\|$, that
is $\|v\|^2 \ge 2|\langle v, x \rangle|$. The Voronoï cell is a finite polytope which tiles the linear span
of $L$ by translations by lattice vectors. We extend the notation $\mathrm{Vor}(b_1, \ldots, b_d)$ to
the case where the first vectors may be zero (the remaining vectors being linearly
independent): $\mathrm{Vor}(b_1, \ldots, b_d)$ denotes the Voronoï cell of the lattice spanned by
the non-zero $b_i$'s. A lattice vector $v \in L$ is called a Voronoï vector if $v/2$ belongs
to the Voronoï cell (in which case $v/2$ will be on the boundary of the Voronoï
cell). $v \in L$ is a strict Voronoï vector if $v/2$ is contained in the interior of a $(d-1)$-
dimensional face of the Voronoï cell. A classical result states that Voronoï vectors
correspond to the minima of the cosets of $L/2L$. We say that $(x_1, \ldots, x_d) \in \mathbb{Z}^d$
is a possible Voronoï coord if there exists a Minkowski-reduced basis $[b_1, \ldots, b_d]$
such that $x_1 b_1 + \ldots + x_d b_d$ is a Voronoï vector. In some parts of the article, we
will deal with Voronoï coordinates with respect to other types of reduced bases:
the kind of reduction considered will be clear from the context. The covering
radius $\rho(L)$ of a lattice $L$ is half of the diameter of the Voronoï cell. The closest
vector problem (CVP) is a non-homogeneous version of SVP: given a basis of a
lattice and an arbitrary vector $x$ of $\mathbb{R}^n$, find a lattice vector $v$ minimizing the
distance $\|v - x\|$; in other words, if $y$ denotes the orthogonal projection of $x$
over the linear span of $L$, find $v \in L$ such that $v - y$ belongs to the Voronoï cell
of $L$.



3 A Greedy Generalization of Gauss’ Algorithm

In dimension two, there is a simple and efficient lattice basis reduction algo- 
rithm due to Gauss. We view Gauss’ algorithm as a greedy algorithm based on 




342 P.Q. Nguyen and D. Stehle 



Fig. 1. Gauss’ algorithm.

Input: A basis $[u, v]$ with its Gram matrix.
Output: A reduced basis of $L[u, v]$, together with its Gram matrix.
1. Repeat
2.    $r := v - xu$ where $x := \lfloor \langle u, v \rangle / \|u\|^2 \rceil$,
3.    $v := u$,
4.    $u := r$,
5.    Update the Gram matrix of $(u, v)$,
6. Until $\|u\| \ge \|v\|$.
7. Return $[v, u]$ and its Gram matrix.



the one-dimensional CVP, which suggests a natural generalization to arbitrary 
dimension that we call the greedy reduction algorithm. We study properties of 
the bases output by the greedy algorithm by defining a new type of reduction 
and comparing it to Minkowski reduction. 



3.1 Gauss’ Algorithm 



Gauss’ algorithm - described in Figure 1 - can be seen as a two-dimensional
generalization of the centered Euclidean algorithm [28]. At Step 2 of each loop
iteration, $u$ is shorter than $v$, and one would like to shorten $v$ rather than
$u$, while preserving the fact that $[u, v]$ is a basis of $L$. This can be achieved by
subtracting from $v$ a multiple $xu$ of $u$, because such a transformation is unimodular.
The optimal choice is when $xu$ is the closest vector to $v$, in the one-dimensional
lattice spanned by $u$, which gives rise to $x := \lfloor \langle u, v \rangle / \|u\|^2 \rceil$. The values $\langle u, v \rangle$ and
$\|u\|^2$ are extracted from $G(u, v)$, which is updated at Step 5 of each loop iteration.
The complexity of Gauss’ algorithm is given by the following classical result:

Theorem 1. Given as input a basis $[u, v]$ of a lattice $L$, Gauss’ algorithm out-
puts a Minkowski-reduced basis of $L$ in time $O(\log \|v\| \cdot [1 + \log \|v\| - \log \lambda_1(L)])$.

Note that this result is not trivial to prove. It is not even clear a priori why
Gauss’ algorithm outputs a Minkowski-reduced basis.
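The loop of Figure 1 translates almost line by line into code. The following is a minimal sketch in Python for integer vectors, with names of our own; unlike the algorithm in the figure it recomputes inner products at every iteration instead of maintaining the Gram matrix incrementally, so it illustrates the logic rather than the stated bit complexity.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gauss_reduce(u, v):
    """Reduce a basis (u, v) of a two-dimensional integer lattice.
    Returns an ordered pair (b1, b2) with ||b1|| <= ||b2||."""
    u, v = list(u), list(v)
    if dot(u, u) > dot(v, v):
        u, v = v, u                                # the input is expected to be ordered
    while True:
        uu, uv = dot(u, u), dot(u, v)
        x = (2 * uv + uu) // (2 * uu)              # a nearest integer to <u, v> / ||u||^2
        r = [vi - x * ui for ui, vi in zip(u, v)]  # Step 2
        v, u = u, r                                # Steps 3 and 4
        if dot(u, u) >= dot(v, v):                 # Step 6
            return v, u                            # Step 7

b1, b2 = gauss_reduce((5, 2), (8, 3))
print(b1, b2)
```

On the basis $(5,2), (8,3)$ of $\mathbb{Z}^2$ the loop runs three times and ends with two orthogonal unit vectors. In general the returned pair satisfies $\|b_1\| \le \|b_2\|$ and $2|\langle b_1, b_2 \rangle| \le \|b_1\|^2$, which are the conditions for a two-dimensional basis to be Minkowski-reduced.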



3.2 The Greedy Reduction Algorithm 

Gauss’ algorithm suggests the general greedy algorithm described in Figure 2,
which uses reduction and closest vectors in dimension $d - 1$, to reduce bases
in dimension $d$. We make a few remarks on the description of the algorithm.
If the Gram matrix is not given, we may compute it in time $O(\log^2 \|b_d\|)$ for
fixed $d$. The algorithm updates the Gram matrix each time the basis changes.

Name: Greedy($b_1, \ldots, b_d$).
Input: A basis $[b_1, \ldots, b_d]$ together with its Gram matrix.
Output: An ordered greedy-reduced basis of $L[b_1, \ldots, b_d]$ with its Gram matrix.
1. If $d = 1$, return $b_1$.
2. Repeat
3.    Order $(b_1, \ldots, b_d)$ by increasing lengths and update the Gram matrix,
4.    $[b_1, \ldots, b_{d-1}] := \mathrm{Greedy}(b_1, \ldots, b_{d-1})$,
5.    Compute a vector $c$ closest to $b_d$, in $L[b_1, \ldots, b_{d-1}]$,
6.    $b_d := b_d - c$ and update the Gram matrix,
7. Until $\|b_d\| \ge \|b_{d-1}\|$.
8. Return $[b_1, \ldots, b_d]$ and its Gram matrix.

Fig. 2. The greedy lattice basis reduction algorithm in dimension $d$.



Step 3 is easy: if this is the first iteration of the loop, the basis is already
ordered; otherwise, $[b_1, \ldots, b_{d-1}]$ is already ordered, and only $b_d$ has to be
inserted among $b_1, \ldots, b_{d-1}$. At Step 4, the greedy algorithm calls itself
recursively in dimension $d - 1$: $G(b_1, \ldots, b_{d-1})$ does not need to be computed
before calling the algorithm, since $G(b_1, \ldots, b_d)$ is already known. At this point,
we do not explain how Step 5 (the computation of closest vectors) is performed: this
issue is postponed to subsection 3.4. Note that for $d = 2$, the greedy algorithm is exactly
Gauss’ algorithm. From a geometrical point of view, the goal of Steps 5 and 6
is to make sure that the orthogonal projection of $b_d$ over the lattice spanned by
$[b_1, \ldots, b_{d-1}]$ lies in the Voronoï cell of that lattice.

An easy proof by induction on $d$ shows that the algorithm terminates. Indeed,
the new vector $b_d$ of Step 6 is strictly shorter than $b_{d-1}$ if the loop does not
end at Step 7. Thus the product of the norms of the $b_i$’s decreases strictly at
each iteration of the loop which is not the last one. But for all $B$, the number
of lattice vectors of norm less than $B$ is finite, which completes the proof.

Although the description of the greedy algorithm is fairly simple, analyzing 
its bit complexity seems very difficult. Even the two-dimensional case of the 
Gaussian algorithm is not trivial. 
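
To make the recursion of Figure 2 concrete, here is a minimal Python sketch restricted to $d \le 3$. It is not the implementation analysed in the paper: it assumes integer coordinates, recomputes inner products instead of maintaining the Gram matrix, and performs Step 5 by solving the Gram system exactly and then trying every coefficient vector within distance 1 of the rounded solution. That search radius is an assumption of this sketch; it suffices when the sublattice basis has at most two vectors and is Minkowski-reduced, which is why the function refuses larger dimensions. The names `dot`, `solve`, `closest_vector` and `greedy` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(G, rhs):
    """Solve G y = rhs exactly over the rationals (Gauss-Jordan elimination)."""
    n = len(G)
    A = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(G, rhs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(n):
            if r != col:
                A[r] = [x - A[r][col] * y for x, y in zip(A[r], A[col])]
    return [row[n] for row in A]


def closest_vector(basis, t):
    """A lattice vector of L[basis] closest to t (basis: 1 or 2 reduced vectors)."""
    G = [[dot(bi, bj) for bj in basis] for bi in basis]
    y = solve(G, [dot(bi, t) for bi in basis])  # coordinates of the projection of t
    best = None
    for delta in product((-1, 0, 1), repeat=len(basis)):
        x = [round(yi) + di for yi, di in zip(y, delta)]
        c = tuple(sum(xi * bi[k] for xi, bi in zip(x, basis)) for k in range(len(t)))
        diff = [tk - ck for tk, ck in zip(t, c)]
        d2 = dot(diff, diff)
        if best is None or d2 < best[0]:
            best = (d2, c)
    return best[1]


def greedy(basis):
    """Greedy reduction (Figure 2) for at most three integer vectors."""
    basis = [tuple(b) for b in basis]
    d = len(basis)
    assert d <= 3, "the search radius used in closest_vector is only justified for d <= 3"
    if d == 1:
        return basis                                              # Step 1
    while True:
        basis.sort(key=lambda b: dot(b, b))                       # Step 3
        basis[:d - 1] = greedy(basis[:d - 1])                     # Step 4
        c = closest_vector(basis[:d - 1], basis[d - 1])           # Step 5
        basis[d - 1] = tuple(a - b for a, b in zip(basis[d - 1], c))  # Step 6
        if dot(basis[d - 1], basis[d - 1]) >= dot(basis[d - 2], basis[d - 2]):
            return basis                                          # Steps 7 and 8
```

For example, `greedy([(1, 0, 0), (4, 1, 0), (7, 3, 1)])` returns the three unit vectors, since the input is a basis of $\mathbb{Z}^3$.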

3.3 Greedy Reduction 

In this subsection, we study properties of the bases output by the greedy algo- 
rithm. As previously mentioned, it is not clear why Gauss’ algorithm outputs a 
Minkowski-reduced basis. But it is obvious that the output basis $[u, v]$ satisfies:
$\|u\| \le \|v\| \le \|v - xu\|$ for all $x \in \mathbb{Z}$. This suggests the following definition:

Definition 2. An ordered basis $[b_1, \ldots, b_d]$ is greedy-reduced if for all $2 \le i \le d$
and for all $x_1, \ldots, x_{i-1} \in \mathbb{Z}$: $\|b_i\| \le \|b_i + x_1 b_1 + \ldots + x_{i-1} b_{i-1}\|$.

In other words, we have the following recursive definition: a one-dimensional
basis is always greedy-reduced, and an ordered basis $[b_1, \ldots, b_d]$ is greedy-reduced if
and only if $[b_1, \ldots, b_{d-1}]$ is greedy-reduced and the projection of $b_d$ over the
linear span of $b_1, \ldots, b_{d-1}$ lies in the Voronoï cell $\mathrm{Vor}(b_1, \ldots, b_{d-1})$. The greedy
algorithm outputs a greedy-reduced basis, and if the input basis is greedy-reduced,
the output basis will be equal to the input basis.

The fact that Gauss’ algorithm outputs Minkowski-reduced bases is a particu- 
lar case of the following result, which compares greedy-reduction with Minkowski 
reduction: 

Lemma 3. The following statements hold:

1. Any Minkowski-reduced basis is greedy-reduced.

2. A basis of $d \le 4$ vectors is Minkowski-reduced if and only if it is
greedy-reduced.

3. If $d \ge 5$, there exists a basis of $d$ vectors which is greedy-reduced, but not
Minkowski-reduced.

As a consequence, the greedy algorithm outputs a Minkowski-reduced basis up 
to dimension four, thus reaching all the successive minima of the lattice; but 
beyond dimension four, the greedy algorithm outputs a greedy-reduced basis 
which may not be Minkowski-reduced. The following lemma shows that greedy- 
reduced bases may considerably differ from Minkowski-reduced bases beyond 
dimension four: 

Lemma 4. Let $d \ge 5$. For all $\varepsilon > 0$, there exists a lattice $L$ and a greedy-reduced
basis $[b_1, \ldots, b_d]$ of $L$ such that: $\lambda_1(L)/\|b_1\| \le \varepsilon$ and $\mathrm{vol}(L)/\prod_{i=1}^{d} \|b_i\| \le \varepsilon$.

Such properties do not hold for Minkowski-reduced bases. The first phenomenon 
shows that greedy-reduced bases may be arbitrarily far from the first minimum, 
while the second one shows that a greedy-reduced basis may be far from being 
orthogonal. 

3.4 Computing Closest Vectors from Minkowski-Reduced Bases 

We now explain how Step 5 of the greedy algorithm can be implemented
efficiently up to $d = 5$. Step 5 is trivial only when $d \le 2$. Otherwise, note that
after Step 4, the $(d-1)$-dimensional basis $[b_1, \ldots, b_{d-1}]$ is greedy-reduced, and
therefore Minkowski-reduced as long as $d \le 5$. And we know the Gram matrix
of $[b_1, \ldots, b_{d-1}, b_d]$.

Theorem 5. Let $d \ge 1$ be an integer. There exists an algorithm which, given
as input a Minkowski-reduced basis $[b_1, \ldots, b_{d-1}]$, a target vector $t$ longer than
all the $b_i$’s, and the Gram matrix of $[b_1, \ldots, b_{d-1}, t]$, outputs a closest lattice
vector $c$ to $t$ (in the lattice spanned by $[b_1, \ldots, b_{d-1}]$), and the Gram matrix of
$(b_1, \ldots, b_{d-1}, t - c)$, in time $O(\log \|t\| \cdot [1 + \log \|t\| - \log \|b_\alpha\|])$, where $1 \le \alpha \le d$
is any integer such that $[b_1, \ldots, b_{\alpha-1}, t]$ is Minkowski-reduced.

Intuitively, the algorithm works as follows: an approximation of the
coordinates (with respect to the $b_i$’s) of the closest vector is computed using linear
algebra, and the approximation is then corrected by a suitable exhaustive search.
Let $h$ be the orthogonal projection of $t$ over the linear span of $b_1, \ldots, b_{d-1}$.
There exist $y_1, \ldots, y_{d-1} \in \mathbb{R}$ such that $h = \sum_{i=1}^{d-1} y_i b_i$. If $c = \sum_{i=1}^{d-1} x_i b_i$ is a
closest vector to $t$, then $h - c$ belongs to $\mathrm{Vor}(b_1, \ldots, b_{d-1})$. However, for any
$C > 0$, the coordinates (with respect to any basis of orthogonality-defect $\le C$)
of any point inside the Voronoï cell can be bounded independently from the
lattice (see [26]). It follows that if we know an approximation of the $y_i$’s with
sufficient precision, then $c$ can be derived from an $O(1)$ exhaustive search, since
the coordinates $y_i - x_i$ of $h - c$ are bounded, and so is the orthogonality-defect
of a Minkowski-reduced basis.

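To approximate the $y_i$’s, we use linear algebra. Let $G = G(b_1, \ldots, b_{d-1})$ and

$$H = \left( \frac{\langle b_i, b_j \rangle}{\|b_i\|^2} \right)_{1 \le i,j \le d-1}.$$

We have:

$$G \cdot \begin{pmatrix} y_1 \\ \vdots \\ y_{d-1} \end{pmatrix} = \begin{pmatrix} \langle b_1, t \rangle \\ \vdots \\ \langle b_{d-1}, t \rangle \end{pmatrix}, \quad \text{and therefore} \quad \begin{pmatrix} y_1 \\ \vdots \\ y_{d-1} \end{pmatrix} = H^{-1} \cdot \begin{pmatrix} \langle b_1, t \rangle / \|b_1\|^2 \\ \vdots \\ \langle b_{d-1}, t \rangle / \|b_{d-1}\|^2 \end{pmatrix}.$$

We use the latter formula to compute the $y_i$’s with a one-bit accuracy, in the
expected time. Let $r = \max_i \lceil \log ( |\langle b_i, t \rangle| / \|b_i\|^2 ) \rceil$. Notice that $r = O(1 + \log \|t\| - \log \|b_\alpha\|)$
by bounding $\langle b_i, t \rangle$ depending on whether $i \ge \alpha$. Notice also that the entries of $H$
are all $\le 1$ in absolute value (because $[b_1, \ldots, b_{d-1}]$ is Minkowski-reduced), and
$\det(H)$ is lower bounded by some universal constant (because the
orthogonality-defect of $[b_1, \ldots, b_{d-1}]$ is bounded). It follows that one can compute the entries
of $H^{-1}$ with a $\Theta(r)$-bit accuracy, in $O(r^2)$ binary operations. One eventually
derives the $y_i$’s with a one-bit accuracy.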
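
As a small worked illustration of this linear-algebra step, take $d = 3$ with $b_1 = (2,0,0)$, $b_2 = (1,2,0)$ and $t = (5,3,1)$. These numbers are chosen here for illustration only and do not come from the paper. The basis $[b_1, b_2]$ is Minkowski-reduced and $t$ is longer than both vectors. With $H$ denoting the Gram matrix whose $i$-th row is divided by $\|b_i\|^2$:

```latex
G = \begin{pmatrix} 4 & 2 \\ 2 & 5 \end{pmatrix}, \qquad
H = \begin{pmatrix} 1 & 1/2 \\ 2/5 & 1 \end{pmatrix}, \qquad
\begin{pmatrix} \langle b_1, t\rangle / \|b_1\|^2 \\ \langle b_2, t\rangle / \|b_2\|^2 \end{pmatrix}
  = \begin{pmatrix} 5/2 \\ 11/5 \end{pmatrix},
\\[1ex]
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
  = H^{-1} \begin{pmatrix} 5/2 \\ 11/5 \end{pmatrix}
  = \frac{5}{4} \begin{pmatrix} 1 & -1/2 \\ -2/5 & 1 \end{pmatrix}
    \begin{pmatrix} 5/2 \\ 11/5 \end{pmatrix}
  = \begin{pmatrix} 7/4 \\ 3/2 \end{pmatrix},
\qquad h = \tfrac{7}{4} b_1 + \tfrac{3}{2} b_2 = (5, 3, 0).
```

Trying the integer pairs next to $(7/4, 3/2)$ gives the closest vector $c = 2b_1 + b_2 = (5,2,0)$, with $\|t - c\|^2 = 2$; the neighbouring candidates $(4,4,0)$ and $(6,4,0)$ are at squared distance 3.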

4 Complexity Analysis of the Greedy Algorithm 

4.1 A Geometric Analysis of Gauss’ Algorithm 

We provide yet another proof of the classical result that Gauss’ algorithm has 
quadratic complexity. Compared to other proofs, our method closely resembles 
the recent one of Semaev [24], itself relatively different from [12,28,10,3]. The 
analysis is not optimal (as opposed to [28]), but its basic strategy can be extended 
up to dimension four. Consider the value of x at Step 2: 

- If $x = 0$, this must be the last iteration of the loop.

- If $|x| = 1$, there are two cases:

• If $\|v - xu\| \ge \|u\|$, then this is the last loop iteration.

• Otherwise, the inequality can be rewritten as $\|u - xv\| < \|u\|$, which
means that $u$ can be shortened with the help of $v$, which can only happen
if this is the first loop iteration, because of the greedy strategy.

- Otherwise, $|x| \ge 2$, which implies that $xu$ is not a Voronoï vector of the
lattice spanned by $u$. Intuitively, this means that $xu$ is far away from $\mathrm{Vor}(u)$,
so that $v - xu$ is considerably shorter than $v$. More precisely, one can show
that $\|v\|^2 \ge \|v - xu\|^2 + 2\|u\|^2$, which is therefore $> 3\|v - xu\|^2$ if this is
not the last loop iteration.
This shows that the product of the norms of the basis vectors decreases by a factor at
least $\sqrt{3}$ at every loop iteration except possibly the first and last ones. Thus, the
number $\tau$ of loop iterations is bounded by: $\tau \le 2 + \log_{\sqrt{3}} \|v\| - \log_{\sqrt{3}} \lambda_1(L)$.

It remains to estimate the cost of each Step 2, which is the cost of computing
$x$. Because $|\langle u, v \rangle| / \|u\|^2 \le \|v\| / \|u\|$, one can see that the bit complexity of Step 2 is
$O(\log \|v\| \cdot [1 + \log \|v\| - \log \|u\|])$. If we denote by $u_i$ and $v_i$ the values of $u$ and
$v$ at the $i$-th iteration, then $v_{i+1} = u_i$ and we obtain that the bit complexity of
Gauss’ algorithm is bounded by:

$$O\Big( \sum_{i=1}^{\tau} \log \|v_i\| \cdot [1 + \log \|v_i\| - \log \|u_i\|] \Big)
= O\Big( \log \|v\| \cdot \sum_{i=1}^{\tau} [1 + \log \|v_i\| - \log \|v_{i+1}\|] \Big)
= O\big( \log \|v\| \cdot [\tau + \log \|v\| - \log \lambda_1(L)] \big).$$

This completes the proof of Theorem 1. 
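
The inequality used in the case $|x| \ge 2$ can be checked directly. The short derivation below is a sketch of one possible argument; the decomposition and the symbols $\mu$ and $v^{\perp}$ are introduced here and are not notation from the paper.

```latex
\text{Write } v = \mu u + v^{\perp} \text{ with } \mu = \frac{\langle u, v\rangle}{\|u\|^2}
\text{ and } \langle u, v^{\perp}\rangle = 0.
\text{ Since } x = \lfloor \mu \rceil, \text{ we have } |\mu - x| \le \tfrac12,
\text{ so } |x| \ge 2 \text{ forces } |\mu| \ge \tfrac32. \text{ Then}
\\[1ex]
\|v\|^2 - \|v - xu\|^2
  = \bigl(\mu^2 - (\mu - x)^2\bigr)\,\|u\|^2
  \;\ge\; \bigl(\tfrac94 - \tfrac14\bigr)\,\|u\|^2
  = 2\,\|u\|^2 .
\\[1ex]
\text{If the iteration is not the last one, } \|v - xu\| < \|u\|,
\text{ hence } \|v\|^2 > 3\,\|v - xu\|^2 .
```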

4.2 A Geometric Analysis up to Dimension Four 

The main result of the paper is the following: 

Theorem 6. Let $1 \le d \le 4$. Given as input an ordered basis $[b_1, \ldots, b_d]$,
the greedy algorithm of Figure 2 based on the algorithm of Theorem 5 outputs
a Minkowski-reduced basis of $L[b_1, \ldots, b_d]$, using a number of bit operations
bounded by $O(\log \|b_d\| \cdot [1 + \log \|b_d\| - \log \lambda_1(L)])$.

However, due to lack of space, we only prove the following weaker result in this 
extended abstract: 

Theorem 7. Let $1 \le d \le 4$. Given as input an ordered basis $[b_1, \ldots, b_d]$,
the greedy algorithm of Figure 2 based on the algorithm of Theorem 5 outputs
a Minkowski-reduced basis of $L[b_1, \ldots, b_d]$, using a number of bit operations
bounded by a polynomial in $\log \|b_d\|$.

The result asserts the polynomial-time complexity of the greedy algorithm, which 
is by far the hardest part of Theorem 6. Both theorems are proved iteratively: the 
case d = 4 is based on the case d = 3, which is itself based on the case d = 2. The 
analysis of Gauss’ algorithm (Section 4.1) was based on the fact that if $|x| \ge 2$,
$xu$ is far away from the Voronoï cell of the lattice spanned by $u$. The proof of
Theorem 7 relies on a similar phenomenon in dimensions two and three. However, 
the situation is considerably more complex, as the following basic remarks hint: 

- For $d = 2$, we considered the value of $x$, but if $d \ge 3$, there will be several
coefficients instead of a single $x$, and it is not clear which coefficient will be
useful in the analysis.

- For $d = 2$, Step 4 cannot change the basis, as there are only two bases in
dimension one. If $d \ge 3$, Step 4 may completely change the vectors, and it
could be hard to keep track of what is going on.

In order to prove Theorem 7, we introduce a few notations. Consider the
$i$-th loop iteration. Let $[a_1^i, \ldots, a_d^i]$ denote the basis $[b_1, \ldots, b_d]$ at the beginning
of the $i$-th loop iteration. The basis $[a_1^i, \ldots, a_d^i]$ becomes $[b_1^i, \ldots, b_{d-1}^i, a_d^i]$ with
$\|b_1^i\| \le \ldots \le \|b_{d-1}^i\|$ after Step 4, and $(b_1^i, \ldots, b_d^i)$ after Step 6, where $b_d^i = a_d^i - c^i$
and $c^i$ is the closest vector $c$ found at Step 5. Let $p_i$ be the number of integers
$1 \le j \le d$ such that $\|b_j^i\| \le \|b_d^i\|$. Let $\pi_i$ be the rank of $b_d^i$ once $(b_1^i, \ldots, b_d^i)$ is
sorted by length: for example, $\pi_i = 1$ if $\|b_d^i\| < \|b_1^i\|$. Clearly, $1 \le \pi_i \le p_i \le d$, if
$p_i = d$ then the loop terminates, and otherwise $\|a_{\pi_i}^{i+1}\| = \|a_{p_i}^{i+1}\|$. Note that $\pi_i$
may not be equal to $p_i$ because there may be several choices when sorting the
vectors by length in case of equalities.

Now consider the $(i+1)$-th loop iteration for some $i \ge 1$. Recall that by
definition of $\pi_i$, we have $a_{\pi_i}^{i+1} = b_d^i = a_d^i - c^i$, while $[a_j^{i+1}]_{j \ne \pi_i} = [b_j^i]_{1 \le j \le d-1}$.
The closest vector $c^{i+1}$ belongs to $L[b_1^{i+1}, \ldots, b_{d-1}^{i+1}] = L[a_1^{i+1}, \ldots, a_{d-1}^{i+1}]$: there
exist integers $x_1^{i+1}, \ldots, x_{d-1}^{i+1}$ such that

$$c^{i+1} = \sum_{j=1}^{d-1} x_j^{i+1} a_j^{i+1}.$$


Suppose we know that Theorem 7 is correct in dimension $d - 1$ with
$2 \le d \le 4$; we are to prove that it is still valid in dimension $d$. Because of
this induction hypothesis and of Theorem 5, the number of bit operations
performed in the $i$-th loop iteration is bounded by a polynomial in $\log \|a_d^i\|$.
Since $\|a_d^i\| \le \|a_d^1\|$ for any $i$, it is sufficient to prove that the number of
loop iterations is bounded by a polynomial in $\log \|a_d^1\|$. Indeed, we show that
there exists a universal constant $C_d > 1$ such that for any execution of the
$d$-dimensional greedy algorithm, in any $d$ consecutive iterations of the loop, the
product of the lengths of the current vectors decreases by some factor higher
than $C_d$:

$$\frac{\|a_1^i\| \cdots \|a_d^i\|}{\|a_1^{i+d}\| \cdots \|a_d^{i+d}\|} \ge C_d. \qquad (1)$$

This automatically ensures that the number of loop iterations is at most
proportional to $\log \|a_d^1\|$, and that the total number of bit operations is bounded
by a polynomial in $\log \|a_d^1\|$.
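
For completeness, here is one way to spell out why Equation (1) bounds the number of loop iterations. The symbols $N$ (number of loop iterations) and $P_i$ (product of the lengths at the beginning of the $i$-th iteration) are introduced here for this sketch and are not notation from the paper.

```latex
P_i := \prod_{j=1}^{d} \|a_j^i\| , \qquad
P_1 \le \|a_d^1\|^{d} \ \ (\text{the input basis is ordered}), \qquad
P_i \ge \lambda_1(L)^{d} \ \ (\text{each } a_j^i \text{ is a non-zero lattice vector}).
\\[1ex]
\text{Applying (1) to } i = 1,\ 1+d,\ 1+2d,\ \ldots \text{ gives }
C_d^{\lfloor (N-1)/d \rfloor} \le \frac{P_1}{P_{1 + d\lfloor (N-1)/d \rfloor}}
  \le \left( \frac{\|a_d^1\|}{\lambda_1(L)} \right)^{d},
\\[1ex]
\text{hence } N \le d + \frac{d^{2}}{\log C_d}\,\bigl( \log \|a_d^1\| - \log \lambda_1(L) \bigr).
```

For an integer lattice, $\lambda_1(L) \ge 1$, so the bound is indeed linear in $\log \|a_d^1\|$ for fixed $d$.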



We deal with the first difficulty mentioned: which coefficient will be useful?
The trick is to consider the value of $x_{\pi_i}^{i+1}$, that is, the coefficient of $a_{\pi_i}^{i+1} = a_d^i - c^i$
in $c^{i+1}$, and to use the greedy properties of the algorithm.

Lemma 8. Among $d$ consecutive iterations of the loop of the greedy algorithm
of Figure 2, there is at least one iteration of index $i + 1$ such that $p_{i+1} \le p_i$.
Moreover, for such a loop iteration, we have $|x_{\pi_i}^{i+1}| \ge 2$.

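Proof. The first statement is obvious. Consider one such loop iteration $i + 1$.
Suppose we have a small $|x_{\pi_i}^{i+1}|$, that is $x_{\pi_i}^{i+1} = 0$ or $|x_{\pi_i}^{i+1}| = 1$.

- If $x_{\pi_i}^{i+1} = 0$, then $c^{i+1} \in L[a_j^{i+1}]_{j \ne \pi_i,\, j < d} = L[b_1^i, \ldots, b_{d-2}^i]$. We claim that the
$(i+1)$-th iteration must be the last one. Because the $i$-th loop iteration
was not terminal, we have $a_d^{i+1} = b_{d-1}^i$. Moreover, $[b_1^i, \ldots, b_{d-1}^i]$ is
greedy-reduced because of Step 4 of the $i$-th loop iteration. These two facts imply
that $c^{i+1}$ must be zero, and the $(i+1)$-th loop iteration is the last one.

- If $|x_{\pi_i}^{i+1}| = 1$, we claim that $p_{i+1} > p_i$. We have $c^{i+1} = \sum_{j=1}^{d-1} x_j^{i+1} a_j^{i+1}$
where $a_{\pi_i}^{i+1} = b_d^i = a_d^i - c^i$ and $[a_j^{i+1}]_{j \ne \pi_i,\, j < d} = [b_1^i, \ldots, b_{d-2}^i]$. Thus, $c^{i+1}$ can be
written as $c^{i+1} = \pm(a_d^i - c^i) + e$ where $e \in L[b_1^i, \ldots, b_{d-2}^i]$. Therefore
$a_d^{i+1} - c^{i+1} = b_{d-1}^i \mp (a_d^i - c^i) - e$. In other words, $\|a_d^{i+1} - c^{i+1}\| = \|a_d^i - f\|$
for some $f \in L[b_1^i, \ldots, b_{d-1}^i]$. It follows that $p_{i+1} \ge 1 + p_i$, which achieves
the claim.

□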

We will see that in dimension three, any such loop iteration $i + 1$ implies that
at least one of the basis vectors significantly decreases in the $(i+1)$-th loop
iteration, or had significantly decreased in the $i$-th loop iteration. This is only
“almost” true in dimension four: fortunately, we will be able to isolate the bad
cases, and to show that when a bad case occurs, the number of remaining loop
iterations can be bounded by some universal constant.

We now deal with the second difficulty. Recall that $c^{i+1} = \sum_{j=1}^{d-1} x_j^{i+1} a_j^{i+1}$,
but the basis $[a_1^{i+1}, \ldots, a_{d-1}^{i+1}]$ is not necessarily greedy-reduced. We distinguish
two cases:

- 1) The basis $[a_1^{i+1}, \ldots, a_{d-1}^{i+1}]$ is somehow far from being greedy-reduced.
Then $b_d^i$ was significantly shorter than $a_d^i$. Note that this length decrease
concerns the $i$-th loop iteration and not the $(i+1)$-th.

- 2) Otherwise, the basis $[a_1^{i+1}, \ldots, a_{d-1}^{i+1}]$ is almost greedy-reduced. The fact
that $|x_{\pi_i}^{i+1}| \ge 2$ roughly implies that $c^{i+1}$ is somewhat far away from the
Voronoï cell $\mathrm{Vor}(a_1^{i+1}, \ldots, a_{d-1}^{i+1})$: this phenomenon will be precisely captured
by the so-called Gap Lemma. When this is the case, the new vector $b_d^{i+1}$ will
be significantly shorter than $a_d^{i+1}$.

To capture the property that a set of vectors is almost greedy-reduced, we
introduce the so-called $\varepsilon$-reduction where $\varepsilon \ge 0$, which is defined as follows:

Definition 9. A single vector $b_1$ is always $\varepsilon$-reduced; for $d \ge 2$, a $d$-tuple
$(b_1, \ldots, b_d)$ is $\varepsilon$-reduced if $(b_1, \ldots, b_{d-1})$ is $\varepsilon$-reduced, $\|b_{d-1}\| \le \|b_d\|$, and the
orthogonal projection of $b_d$ over the linear span of $(b_1, \ldots, b_{d-1})$ belongs to
$(1 + \varepsilon) \cdot \mathrm{Vor}(b_1, \ldots, b_{d-1})$.

With this definition, a greedy-reduced basis is $\varepsilon$-reduced for any $\varepsilon \ge 0$. In the
definition of $\varepsilon$-reduction, we did not assume that the $b_i$’s were nonzero nor
linearly independent. This is because the Gap Lemma is essentially based on
compactness properties: the set of $\varepsilon$-reduced $d$-tuples needs to be closed (from a
topological point of view), while a limit of bases may not be a basis.

We can now give the precise statements of the two cases described above. 
Lemma 10 corresponds to case 1), and Lemma 11 to case 2). 

Lemma 10. Let $2 \le d \le 4$. There exists a constant $\varepsilon_1 > 0$ such that for any
$\varepsilon \le \varepsilon_1$ there exists $C_\varepsilon > 1$ such that the following statement holds. Consider the
$(i+1)$-th loop iteration of an execution of the $d$-dimensional greedy algorithm.
If $[a_1^{i+1}, \ldots, a_{d-1}^{i+1}]$ is not $\varepsilon$-reduced, then $\|a_d^i\| \ge C_\varepsilon \|b_d^i\|$.
Lemma 11. Let $2 \le d \le 4$. There exist two constants $\varepsilon_2 > 0$ and $D > 0$ such
that the following statement holds. Consider the $(i+1)$-th loop iteration of an
execution of the $d$-dimensional greedy algorithm. Suppose that $[a_1^{i+1}, \ldots, a_{d-1}^{i+1}]$
is $\varepsilon_2$-reduced, and that $\|a_k^{i+1}\| \ge (1 - \varepsilon_2) \|a_d^{i+1}\|$ for some $1 \le k \le d - 1$. Then,
if $|x_k| \ge 2$ and if we are not in the 211-case, we have:

$$\|b_d^{i+1}\|^2 + D \|b_k^{i+1}\|^2 \le \|a_d^{i+1}\|^2,$$

where the 211-case is: $d = 4$, $|x_k| = 2$ and the other $|x_j|$’s are all equal to 1.

This last lemma is a direct consequence of Pythagoras’ theorem and the Gap Lemma (which
is crucial to our analysis, and to which the next section is devoted):

Theorem 12 (Gap Lemma). Let $2 \le d \le 4$. There exist two constants $\varepsilon_2 > 0$
and $D > 0$ such that the following statement holds. Let $[a_1, \ldots, a_{d-1}]$ be
$\varepsilon_2$-reduced vectors, $u$ be a vector of $\mathrm{Vor}(a_1, \ldots, a_{d-1})$ and $x_1, \ldots, x_{d-1}$ be integers.
If $\|a_k\| \ge (1 - \varepsilon_2) \|a_{d-1}\|$ for some $k \le d - 2$, then:

$$\|u\|^2 + D \|a_k\|^2 \le \Big\| u + \sum_{j=1}^{d-1} x_j a_j \Big\|^2,$$

where $|x_k| \ge 2$, and if $d = 4$ the two other $|x_j|$’s are not all equal to 1.

This completes the overall description of the proof of Theorem 7. Indeed,
choose three constants $\varepsilon, D > 0$ and $C > 1$ such that we can apply Lemmata 10
and 11. We prove that Equation (1) holds for $C_d = \min(C, \sqrt{1 + D}, \frac{1}{1 - \varepsilon}) > 1$.
Consider a loop iteration $i + 1$ such that $p_{i+1} \le p_i$. Recall that among any $d$
consecutive iterations of the loop, there is at least one such iteration. For such
an iteration, we have $|x_{\pi_i}^{i+1}| \ge 2$. We distinguish four cases:

- $[a_1^{i+1}, \ldots, a_{d-1}^{i+1}]$ is not $\varepsilon$-reduced: then Lemma 10 gives the result through
the $i$-th loop iteration.

- $\|a_{\pi_i}^{i+1}\| < (1 - \varepsilon) \|a_d^{i+1}\|$: because $p_{i+1} \le p_i$, we have the inequalities
$\|b_d^{i+1}\| \le \|a_{p_i}^{i+1}\| = \|a_{\pi_i}^{i+1}\| < (1 - \varepsilon) \|a_d^{i+1}\|$.

- We are in the 211-case, i.e. $d = 4$ with $|x_{\pi_i}| = 2$ and the other $|x_j|$’s are all
equal to 1: we refer to the detailed analysis of subsection 4.3.

- Otherwise, we apply Lemma 11, which gives the expected result through the
$(i+1)$-th loop iteration.

We described our strategy to prove that the greedy algorithm is
polynomial-time up to dimension four. One can further prove that the bit complexity is in
fact quadratic, by carefully assessing the costs of each loop iteration and
combining them, but the proof is much more technical than in the two-dimensional
case.
4.3 Concluding in Dimension Four



In the previous subsections, we showed that the greedy algorithm is
polynomial-time in dimensions two and three, but we noticed that a new difficulty arose
in dimension four: the Gap Lemma is useless in the so-called 211-case. This
is because there are three-dimensional Minkowski-reduced bases $[b_1, b_2, b_3]$ for
which $2b_i + s_1 b_j + s_2 b_k$ - with $\{i, j, k\} = \{1, 2, 3\}$ and $|s_1| = |s_2| = 1$ - is a
Voronoï vector. Indeed consider the lattice spanned by the columns $b_1, b_2, b_3$ of
the following matrix:

$$M = \begin{pmatrix} 1 & 1 & -1 \\ 1 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

This basis is Minkowski-reduced and $\|b_1 + b_2 + 2b_3\| = \|b_1 + b_2\| \le \|(2k_1 + 1) b_1 + (2k_2 + 1) b_2 + 2k_3 b_3\|$
for any $k_1, k_2, k_3 \in \mathbb{Z}$. Therefore, a vector in the
Voronoï cell centered in $b_1 + b_2 + 2b_3$ can avoid being significantly shortened
when translated inside the Voronoï cell centered in 0.
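
The claim about the matrix $M$ can be checked numerically. The sketch below enumerates the vectors $(2k_1+1) b_1 + (2k_2+1) b_2 + 2k_3 b_3$ for small $k_i$ and lists those of minimal length. Restricting the search to $|k_i| \le 3$ is an arbitrary choice made here, and the names `comb`, `norm2` and `coset` are hypothetical.

```python
from itertools import product

# Columns of the matrix M
basis = [(1, 1, 0), (1, -1, 0), (-1, 0, 1)]


def comb(coeffs):
    """Integer combination of the basis vectors with the given coefficients."""
    return tuple(sum(c * b[k] for c, b in zip(coeffs, basis)) for k in range(3))


def norm2(v):
    return sum(x * x for x in v)


K = 3  # search box |k_i| <= K (assumption of this sketch)
coset = {}
for k1, k2, k3 in product(range(-K, K + 1), repeat=3):
    coeffs = (2 * k1 + 1, 2 * k2 + 1, 2 * k3)
    coset[coeffs] = norm2(comb(coeffs))

shortest = min(coset.values())
minimizers = sorted(c for c, n in coset.items() if n == shortest)
print(shortest, minimizers)
```

Both $(1,1,0)$ and $(1,1,2)$ appear among the minimizers, which is the non-strictness phenomenon discussed in this subsection.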

The Gap Lemma cannot tackle this problem. However, we note that $(1,1,2)$
is rarely a Voronoï coordinate (with respect to a Minkowski-reduced basis), and
when it is, it cannot be a strict Voronoï coord: we can prove that if $(1,1,2)$ is a
Voronoï coord, then $\|b_1 + b_2\| = \|b_1 + b_2 + 2b_3\|$, which tells us that $b_1 + b_2 + 2b_3$
is not the only vector in its coset of $L/2L$ reaching the minimum. It turns out
that the lattice spanned by the columns of $M$ is essentially the only one for
which $(1,1,2)$ - modulo any change of sign and permutation of coordinates -
can be a Voronoï coord. More precisely, if $(1,1,2)$ - modulo any change of sign
and permutation of coordinates - is a Voronoï coord for a lattice basis, then
the basis matrix can be written as $rUM$ where $r$ is any non-zero real number
and $U$ is any orthogonal matrix. Since a basis can be arbitrarily close to one
of these without being one of them, we need to consider a small compact set of
normalized bases around the annoying ones. More precisely, this subset is:



|[bi, b2, bs] £-reduced / 3 <t G 5s, | 



:|G'(b£r(i), bcr(2)) ^ct(3))I~|-^*-^I IIoo < £}, 



for some sufficiently small $\varepsilon > 0$, where $\|M\|_\infty$ is the maximum of the absolute
values of the entries of the matrix $M$ and $|M|$ is the matrix of the absolute values.

Now, suppose that we are in the 211-case at some loop iteration $i + 1$. We
distinguish three cases:

- $[a_1^{i+1}, a_2^{i+1}, a_3^{i+1}]$ is outside the compact. In this case, a variant of the Gap
Lemma (Lemma 29) proved in Section 5 is valid, and can be used to show
that $b_4^{i+1}$ is significantly shorter than $a_4^{i+1}$.

- $[a_1^{i+1}, a_2^{i+1}, a_3^{i+1}]$ is inside the compact, but $a_4^{i+1}$ is far from the Voronoï cell
$\mathrm{Vor}(a_1^{i+1}, a_2^{i+1}, a_3^{i+1})$. In this case, $b_4^{i+1}$ is significantly shorter than $a_4^{i+1}$.

- Otherwise the overall geometry of $[a_1^{i+1}, a_2^{i+1}, a_3^{i+1}, a_4^{i+1}]$ is very precisely
known, and we can show that there remain at most $O(1)$ loop iterations.



More precisely, by using Lemma 29, we show that:
Lemma 13. There exist three constants $C, \varepsilon > 0$, and $c \in \mathbb{Z}$ such that the
following holds. Consider an execution of the four-dimensional greedy algorithm,
and a loop iteration $i + 1$ for which $p_{i+1} \le p_i$, $[a_1^{i+1}, a_2^{i+1}, a_3^{i+1}]$ is $\varepsilon$-reduced,
$\|a_{\pi_i}^{i+1}\| \ge (1 - \varepsilon) \|a_4^{i+1}\|$, and $(|x_{\sigma(1)}|, |x_{\sigma(2)}|, |x_{\sigma(3)}|) = (1, 1, 2)$ for some
permutation $\sigma$ of $\{1, 2, 3\}$. Then either $\|a_4^{i+1}\| \ge (1 + C) \|b_4^{i+1}\|$ or:

$$\Big\| \frac{1}{\|a_4^{i+1}\|^2} \big| G(a_{\sigma(1)}^{i+1}, a_{\sigma(2)}^{i+1}, a_{\sigma(3)}^{i+1}, a_4^{i+1}) \big| - A \Big\|_\infty \le \varepsilon, \quad \text{with: } A = \begin{pmatrix} 1 & 0 & \frac12 & 0 \\ 0 & 1 & \frac12 & 0 \\ \frac12 & \frac12 & 1 & \frac12 \\ 0 & 0 & \frac12 & 1 \end{pmatrix}.$$

In order to prove this result, we restrict more and more the possible geometry
of $[a_1^{i+1}, a_2^{i+1}, a_3^{i+1}, a_4^{i+1}]$. Note that this critical geometry corresponds to the root
lattice $D_4$. We treat this last case by applying the following lemma, which roughly
says that if the Gram matrix of a basis is sufficiently close to some invertible
matrix, then the number of short vectors generated by the basis remains bounded.

Lemma 14. Let $A$ be an invertible $d \times d$ matrix, and $B > 0$. Then there exist
$\varepsilon, N > 0$ such that for any basis $(b_1, \ldots, b_d)$, if $\|G(b_1, \ldots, b_d) - A\|_\infty \le \varepsilon$, then:

$$\big| \{ (x_1, \ldots, x_d) \;/\; \|x_1 b_1 + \ldots + x_d b_d\| \le B \} \big| \le N.$$



4.4 Failure in Dimension Five 

In this subsection, we explain why the analysis of the greedy algorithm breaks
down in dimension five. First of all, the basis returned by the algorithm is not
necessarily Minkowski-reduced, since greedy and Minkowski reductions differ in
dimension five. Consider the lattice spanned by the columns of the following
matrix:

$$\begin{pmatrix} 2 & 0 & 0 & 0 & 1 \\ 0 & 2 & 0 & 0 & 1 \\ 0 & 0 & 2 & 0 & 1 \\ 0 & 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 0 & \varepsilon \end{pmatrix},$$

where $0 < \varepsilon < 1$. This basis is clearly greedy-reduced, but the vector
$(0, 0, 0, 0, 2\varepsilon)^t = 2b_5 - b_1 - b_2 - b_3 - b_4$ belongs to the lattice and is shorter than $b_1$.
Moreover, for a small $\varepsilon$, this shows that a greedy-reduced
basis can be arbitrarily far from the first minimum, and can have an arbitrarily
large orthogonality defect: $\|b_5^*\|$ is very small compared with $\|b_5\|$. For $\varepsilon$ close to 1,
this basis shows that the length decrease factor through one loop iteration of the
five-dimensional greedy algorithm can be arbitrarily close to 1.
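
The five-dimensional example can be explored with exact arithmetic. The sketch below fixes $\varepsilon = 1/2$ (an arbitrary value chosen here), computes the short lattice vector $2b_5 - b_1 - b_2 - b_3 - b_4$, and checks the inequalities of Definition 2 on a finite box of coefficients. The box size and the names `comb`, `norm2` and `greedy_reduced_on_box` are assumptions of this illustration, so the check is only a sanity test, not a proof.

```python
from fractions import Fraction
from itertools import product

eps = Fraction(1, 2)  # any value with 0 < eps < 1; chosen for illustration
b = [
    (2, 0, 0, 0, 0),
    (0, 2, 0, 0, 0),
    (0, 0, 2, 0, 0),
    (0, 0, 0, 2, 0),
    (1, 1, 1, 1, eps),
]


def norm2(v):
    return sum(x * x for x in v)


def comb(coeffs):
    """Integer combination of b_1, ..., b_5 with the given coefficients."""
    return tuple(sum(c * v[k] for c, v in zip(coeffs, b)) for k in range(5))


# The lattice vector 2*b5 - b1 - b2 - b3 - b4 = (0, 0, 0, 0, 2*eps)
short = comb((-1, -1, -1, -1, 2))


def greedy_reduced_on_box(K=2):
    """Check ||b_i|| <= ||b_i + sum_j x_j b_j|| for all |x_j| <= K (Definition 2)."""
    for i in range(1, 5):
        for x in product(range(-K, K + 1), repeat=i):
            coeffs = x + (1,) + (0,) * (4 - i)
            if norm2(comb(coeffs)) < norm2(b[i]):
                return False
    return True


print(short, norm2(short), norm2(b[0]), greedy_reduced_on_box())
```

With $\varepsilon = 1/2$ the short vector has squared norm 1, while every basis vector has squared norm at least 4.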

Nevertheless, the greedy algorithm is well-defined in dimension five (the four- 
dimensional greedy algorithm can be used for Step 4, and since it returns a 
Minkowski reduced basis, the CVP algorithm of Theorem 5 can be used in 
Step 5). Despite the fact that the algorithm does not return a Minkowski-reduced
basis, one may wonder if the analysis remains valid, and if the number
of loop iterations of the 5-dimensional greedy algorithm is linear in $\log \|b_5\|$.
The analysis in dimensions two, three and four essentially relies on the fact that
if one of the $x_j$’s found at Step 5 has absolute value at least 2, then $\|b_d^i\|$
is significantly shorter than $\|a_d^i\|$. This fact is derived from the so-called Gap
Lemma. In dimension four, this was only partly true, but the exception (the
211-case) happened in very few cases and could be dealt with by considering the very
specific shape of the lattices for which it could go wrong. Things worsen in
dimension five. Indeed, for Minkowski-reduced bases, $(1,1,1,2)$ and $(1,1,2,2)$ -
modulo any change of sign and permutation of coordinates - are possible Voronoï
coords. Here is an example of a lattice where $(1,1,2,2)$ is a Voronoï coord:

$$\begin{pmatrix} 1 & -1 & 0 & 0 \\ 1 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

The lattice basis given by the columns is Minkowski-reduced, but:

$$\|b_1 + b_2 + 2b_3 + 2b_4\| = 2 = \|b_1 + b_2\| \le \|(2k_1 + 1) b_1 + (2k_2 + 1) b_2 + 2k_3 b_3 + 2k_4 b_4\|,$$

for any $k_1, k_2, k_3, k_4 \in \mathbb{Z}$. Note that $(1,1,2,2)$ cannot be a strict Voronoï coord:
if $b_1 + b_2 + 2b_3 + 2b_4$ reaches the length minimum of its coset of $L/2L$, then so
does $b_1 + b_2$. Thus it might be possible to work around the difficulty coming from
$(1,1,2,2)$ like in the previous subsection. However, the case $(1,1,1,2)$ would still
remain, and this possible Voronoï coordinate can be strict.

5 The Geometry of Low-Dimensional Lattices 

In this section, we give some results about Voronoi cells in dimensions two and 
three, which are crucial to our complexity analysis of the greedy algorithm de- 
scribed in Section 3. More precisely, the analysis is based on the Gap Lemma 
(subsection 5.3), which is derived from the study of Voronoi cells in the case of 
$\varepsilon$-reduced vectors (subsection 5.2), itself derived from the study of Voronoi cells
for Minkowski-reduced bases (subsection 5.1). 

5.1 Voronoi Cells in the Case of Minkowski-Reduced Bases

We give simple bounds on the diameter of the Voronoi cell and on the Gram- 
Schmidt orthogonalization of a Minkowski-reduced basis: 

Lemma 15. Let $d \ge 1$. Let $[b_1, \ldots, b_d]$ be a basis of a lattice $L$. Then $\rho(L) \le \frac{\sqrt{d}}{2} \|b_d\|$.
As a consequence, if $d \le 4$ and if $[b_1, \ldots, b_d]$ is a Minkowski-reduced
basis, then $\|b_d^*\| \ge \frac{\sqrt{5 - d}}{2} \|b_d\|$.
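
The second statement of Lemma 15 can be derived from the first one as follows. This is a sketch of one possible argument, not necessarily the authors' proof; $L'$ and $h$ are names introduced here, with $L' = L[b_1, \ldots, b_{d-1}]$ and $h$ the orthogonal projection of $b_d$ over the linear span of $L'$.

```latex
\text{A Minkowski-reduced basis is greedy-reduced (Lemma 3), so } b_d
\text{ is a shortest vector of } b_d + L', \text{ i.e. } \|b_d\|^2 = \|b_d^*\|^2 + \mathrm{dist}(h, L')^2 .
\\[1ex]
\|b_d\|^2 \;\le\; \|b_d^*\|^2 + \rho(L')^2
          \;\le\; \|b_d^*\|^2 + \frac{d-1}{4}\,\|b_{d-1}\|^2
          \;\le\; \|b_d^*\|^2 + \frac{d-1}{4}\,\|b_d\|^2 ,
\\[1ex]
\text{hence}\quad \|b_d^*\|^2 \;\ge\; \frac{5-d}{4}\,\|b_d\|^2 .
```

The bound is only informative when $5 - d > 0$, which is consistent with the restriction $d \le 4$.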

The following lemma provides the possible Voronoi vectors of a two-dimensional 
lattice given by a Minkowski-reduced basis. Such a basis confines the coordinates 
of Voronoi vectors: 
Lemma 16. In dimension two, the possible Voronoï coords are $(1, 0)$ and $(1, 1)$,
modulo any change of signs and permutation of coordinates, i.e. any nonzero
$(\varepsilon_1, \varepsilon_2)$ where $|\varepsilon_1|, |\varepsilon_2| \le 1$.

The proof relies on a detailed study of the expression $\|(2x_1 + \varepsilon_1) \cdot b_1 + (2x_2 + \varepsilon_2) \cdot b_2\|^2 - \|\varepsilon_1 b_1 + \varepsilon_2 b_2\|^2$,
where $[b_1, b_2]$ is Minkowski-reduced, $\varepsilon_1, \varepsilon_2 \in \{0, 1\}$ and
$x_1, x_2 \in \mathbb{Z}$. Indeed, since Voronoï coords of a lattice $L$ are given by the minima
of the non-zero cosets of $L/2L$, it suffices to show that if $x_1 x_2 \ne 0$, then this
expression is strictly positive. We do this by a rather technical study.
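
The characterization of Voronoï coords through the cosets of $L/2L$ lends itself to a brute-force experiment. The following sketch lists, for each non-zero coset, the coefficient vectors reaching the minimal length. It only searches a finite box of coefficients, which is an assumption made here, and `voronoi_coords` is a hypothetical helper name.

```python
from itertools import product


def norm2(v):
    return sum(x * x for x in v)


def voronoi_coords(basis, K=3):
    """Coefficient vectors reaching the minimum of each non-zero coset of L/2L.

    Only coefficients of the form 2*k + e with |k| <= K are examined,
    so the result is reliable only if the true minima lie in that box.
    """
    d = len(basis)
    n = len(basis[0])
    coords = set()
    for e in product((0, 1), repeat=d):
        if not any(e):
            continue  # skip the zero coset
        cands = {}
        for k in product(range(-K, K + 1), repeat=d):
            c = tuple(2 * ki + ei for ki, ei in zip(k, e))
            v = tuple(sum(ci * b[j] for ci, b in zip(c, basis)) for j in range(n))
            cands[c] = norm2(v)
        m = min(cands.values())
        coords.update(c for c, val in cands.items() if val == m)
    return coords


# A Minkowski-reduced basis in dimension two (example chosen for this sketch)
print(sorted(voronoi_coords([(2, 0), (1, 2)])))
```

On this example every coordinate has absolute value at most 1, in agreement with Lemma 16.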

We generalize this analysis to the three-dimensional case. The underlying 
ideas of the proof are the same, but because of the increasing number of variables, 
the analysis becomes more tedious. 

Lemma 17. In dimension three, the possible Voronoi coordinates are (1,0,0), 
(1,1,0), (1,1,1) and (2,1,1), modulo any change of signs and permutation of 
coordinates. 

The possible Voronoi coord (2,1,1) creates difficulties when analyzing the 
greedy algorithm in dimension four, because it contains a two, which cannot be 
handled with the greedy argument used for the ones. We tackle this problem as 
follows: we show that when (2, 1, 1) happens to be a Voronoi coord, the lattice has 
a very specific shape, for which the behavior of the algorithm is well-understood. 

Lemma 18. Suppose $[b_1, b_2, b_3]$ is a Minkowski-reduced basis.

1. If any of $(s_1, s_2, 2)$ is a Voronoï coord with $s_i = \pm 1$ for $i \in \{1, 2\}$, then
$\|b_1\| = \|b_2\| = \|b_3\|$, $\langle b_1, b_2 \rangle = 0$ and $\langle b_i, b_3 \rangle = -s_i \|b_1\|^2 / 2$ for $i = 1, 2$.

2. If any of $(s_1, 2, s_3)$ is a Voronoï coord with $s_i = \pm 1$ for $i \in \{1, 3\}$, then
$\|b_1\| = \|b_2\|$. Moreover, if $\|b_1\| = \|b_2\| = \|b_3\|$, then $\langle b_1, b_3 \rangle = 0$ and $\langle b_i, b_2 \rangle =
-s_i \|b_1\|^2 / 2$ for $i = 1, 3$.

3. If any of $(2, s_2, s_3)$ is a Voronoï coord with $s_i = \pm 1$ for $i \in \{2, 3\}$ and $\|b_1\| =
\|b_2\| = \|b_3\|$, then $\langle b_2, b_3 \rangle = 0$ and $\langle b_i, b_1 \rangle = -s_i \|b_1\|^2 / 2$ for $i = 2, 3$.

5.2 Voronoi Cells in the Case of ε-Reduced Vectors

We extend the results of the previous subsection to the case of $\varepsilon$-reduced
vectors. The idea is that if we compactify the set of Minkowski-reduced bases and
slightly enlarge it, the possible Voronoï coords remain the same. Unfortunately,
by doing so, some of the vectors we consider may be zero, and this creates an
infinity of possible Voronoï coords: for example, if $b_1 = 0$, any pair $(x_1, 0)$ is a
Voronoï coord of $[b_1, b_2]$. To tackle this problem, we restrict to $b_i$’s with “similar”
lengths. More precisely, we use the so-called Topological Lemma: if we can
guarantee that the possible Voronoï coords of the enlargement of the initial compact
set of bases are bounded, then for a sufficiently small enlargement, the possible
Voronoï coords remain the same. We first give rather simple results on $\varepsilon$-reduced
vectors and their Gram-Schmidt orthogonalization, then we introduce the
Topological Lemma (Lemma 21), from which we finally derive the relaxed versions of
Lemmata 16, 17 and 18.
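Lemma 19. There exists a constant $c > 0$ such that for any sufficiently small
$\varepsilon > 0$, if $[b_1, b_2, b_3]$ are $\varepsilon$-reduced, then the following inequalities hold:

$$|\langle b_i, b_j \rangle| \le \frac{1 + c\varepsilon}{2} \|b_i\|^2 \quad \text{for any } i < j,$$

$$|\langle b_3, \varepsilon_1 b_1 + \varepsilon_2 b_2 \rangle| \le \frac{1 + c\varepsilon}{2} \|\varepsilon_1 b_1 + \varepsilon_2 b_2\|^2 \quad \text{for any } \varepsilon_1, \varepsilon_2 \in \{-1, 1\}.$$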

This result implies that if $[b_1, b_2, b_3]$ are $\varepsilon$-reduced, the only case for which
the $b_i$’s can be linearly dependent is when some of them are zero, but this case
cannot be avoided since we need to compactify the set of Minkowski-reduced bases.
The following lemma generalizes Lemma 15. It shows that even with $\varepsilon$-reduced
vectors, if the dimension is at most four, then the Gram-Schmidt orthogonalization
process cannot arbitrarily decrease the lengths of the initial vectors.

Lemma 20. There exists $C > 0$ such that for any $1 \le d \le 4$ and any sufficiently
small $\varepsilon > 0$, if $[b_1, \ldots, b_d]$ are $\varepsilon$-reduced vectors, then we have $\|b_d^*\| \ge C \|b_d\|$.

The Topological Lemma is the key argument when extending the results
on possible Voronoï coords from Minkowski-reduced bases to $\varepsilon$-reduced vectors.
When applying it, $X_0$ will correspond to the $x_i$’s, $K_0$ to the $b_i$’s, $X$ to the
possible Voronoï coordinates, and $f$ to the continuous function of real variables

$$f : \big( (b_i)_i, (y_i)_i \big) \mapsto \|y_1 b_1 + \ldots + y_d b_d\|.$$

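Lemma 21 (Topological Lemma). Let $n, m \ge 1$. Let $X_0$ and $K_0$ be compact
sets of $\mathbb{R}^n$ and $\mathbb{R}^m$. Let $f$ be a continuous function from $K_0 \times X_0$ to $\mathbb{R}$. For any
$a \in K_0$ we define $M_a = \{ x \in X_0 \cap \mathbb{Z}^n \;/\; f(a, x) = \min_{x' \in X_0 \cap \mathbb{Z}^n} f(a, x') \}$. Let
$K \subset K_0$ be a compact and $X = \bigcup_{a \in K} M_a \subset X_0 \cap \mathbb{Z}^n$. With these notations, there
exists $\varepsilon > 0$ such that if $b \in K_0$ satisfies $\mathrm{dist}(b, K) \le \varepsilon$, we have $M_b \subset X$.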

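In order to apply the Topological Lemma, we need to map the relaxed bases
into a compact set. For any $\varepsilon \ge 0$ and any $\alpha \in [0, 1]$, we define:

$$K_2(\varepsilon, \alpha) = \{ (b_1, b_2) \;/\; b_1, b_2 \ \varepsilon\text{-reduced},\ \alpha \le \|b_1\| \le \|b_2\| = 1 \}$$

$$K_3(\varepsilon, \alpha) = \{ (b_1, b_2, b_3) \;/\; b_1, b_2, b_3 \ \varepsilon\text{-reduced},\ \alpha \le \|b_1\| \le \|b_2\| \le \|b_3\| = 1 \}.$$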



Lemma 22. If $\varepsilon \ge 0$ and $\alpha \in [0, 1]$, $K_2(\varepsilon, \alpha)$ and $K_3(\varepsilon, \alpha)$ are compact sets.

The following lemma is the relaxed version of Lemma 16. It can also be
viewed as a reciprocal to Lemma 19.

Lemma 23. For any $\alpha \in \, ]0, 1]$ and any sufficiently small $\varepsilon > 0$, the possible
Voronoï coords of $[b_1, b_2] \in K_2(\varepsilon, \alpha)$ are the same as for Minkowski-reduced
bases, i.e. $(1, 0)$ and $(1, 1)$, modulo any change of signs and permutation of coordinates.

We now relax Lemma 17 in the same manner.

Lemma 24. For any $\alpha \in \, ]0, 1]$ and any sufficiently small $\varepsilon > 0$, the possible
Voronoï coords of $[b_1, b_2, b_3] \in K_3(\varepsilon, \alpha)$ are the same as for Minkowski-reduced
bases.

The following result generalizes Lemma 18 about the possible Voronoï coord
$(1, 1, 2)$. As opposed to the two previous results, there is no need to use the
Topological Lemma in this case, because only a finite number of $(x_1, x_2, x_3)$'s is
considered.

Lemma 25. There exists a constant c' > 0 such that for any sufficiently small 
e>0, if[bi, 62, ^>3] are e-reduced and ||b3|| = 1 , then: 

1. If any of (si,S 2 j 2 ) is a Voronoi' coord with Si = ±1 for i G { 1 , 2 }, then: 

ll^ill > 1 - c'e, |(bi,b2)| < c'e and |(b*, 63) + < c'e for i = l,2. 

2. If any of (51,2,53) is a Voronoi coord with Si = ±1 for i G { 1 , 3 }, then 

(1 — c'£r)||b2|| < ll^ill < ll^2||- Moreover, z/ ||bi|| > 1 — e, then: |(6i, bs)] < c'e 
and \{bi, 62) -I- < c'e for z = 1 , 3 . 

3. If any of (2,52,53) is a Voronoi coord with Si = ±1 for i G { 2 , 3 } and if 

ll^ill > 1 — then: |(b2, ^3)] < c'e and \{bi, bi) | < c'e for i = 2,3. 

5.3 The Gap Lemma 

The goal of this subsection is to prove that even with relaxed bases, if one adds
a lattice vector with not too small coordinates to a vector of the Voronoï cell,
this vector becomes significantly longer. This result will be used the other way
round: if the $x_i$'s found at Step 5 of the greedy algorithm are not too small, then
$b_d$ is significantly shorter than $a_d$. We first need to generalize the compact sets
$K_2$ and $K_3$. For any $\varepsilon > 0$ and any $\alpha \in [0, 1]$, we define:

$K'_2(\varepsilon, \alpha) = \{(b_1, b_2, u) \ / \ (b_1, b_2) \in K_2(\varepsilon, \alpha), \ u \in \mathrm{Vor}(b_1, b_2)\}$

$K'_3(\varepsilon, \alpha) = \{(b_1, b_2, b_3, u) \ / \ (b_1, b_2, b_3) \in K_3(\varepsilon, \alpha), \ u \in \mathrm{Vor}(b_1, b_2, b_3)\}$.

Lemma 26. If $\varepsilon > 0$ and $\alpha \in [0, 1]$, $K'_2(\varepsilon, \alpha)$ and $K'_3(\varepsilon, \alpha)$ are compact sets.

The next result is the two-dimensional version of the Gap Lemma. 

Lemma 27. There exist two constants $\varepsilon, C > 0$ such that for any $\varepsilon$-reduced
vectors $[b_1, b_2]$ and any $u \in \mathrm{Vor}(b_1, b_2)$, if at least one of the following conditions
holds, then: $\|u + x_1 b_1 + x_2 b_2\|^2 \ge \|u\|^2 + C \|b_2\|^2$.

- (1) $|x_2| \ge 2$,

- (2) $|x_1| \ge 2$ and $\|b_1\| \ge \|b_2\|/2$.

We now give the three-dimensional Gap Lemma, on which the analysis of
the four-dimensional greedy algorithm relies.

Lemma 28. There exist two constants $\varepsilon, C > 0$ such that for any $\varepsilon$-reduced
vectors $[b_1, b_2, b_3]$ and any $u \in \mathrm{Vor}(b_1, b_2, b_3)$, if at least one of the following
conditions holds, then:

$\|u + x_1 b_1 + x_2 b_2 + x_3 b_3\|^2 \ge \|u\|^2 + C \|b_3\|^2$.

- (1) $|x_3| \ge 3$, or $|x_3| = 2$ and $(|x_1|, |x_2|) \ne (1, 1)$;

- (2) $\|b_2\| \ge \|b_3\|/2$ and: $|x_2| \ge 3$, or $|x_2| = 2$ with $(|x_1|, |x_3|) \ne (1, 1)$;

- (3) $\|b_1\| \ge \|b_3\|/2$ and: $|x_1| \ge 3$, or $|x_1| = 2$ with $(|x_2|, |x_3|) \ne (1, 1)$.

As in the previous subsections, we now consider the case of the possible
Voronoï coords $(\pm 1, \pm 1, \pm 2)$ modulo any permutation of coordinates.

Lemma 29. There exist two constants $\varepsilon, C > 0$ such that for any $\varepsilon$-reduced
vectors $[b_1, b_2, b_3]$ and any $u \in \mathrm{Vor}(b_1, b_2, b_3)$, if at least one of the following
conditions holds, then:

$\|u + x_1 b_1 + x_2 b_2 + x_3 b_3\|^2 \ge \|u\|^2 + C \|b_3\|^2$.

- T {xi,X2,xz) = (si,S2,2), |si| = 1 fori G { 1 , 2 } and at least one of the 
following conditions holds: 

||bi|| < (l-£)||b 3 ||, or ||b2|| < (l-e)||b3||, or |(bi,b2)| > e||b 3 |P, or 
\{bi,bf) + si^^l > e||b3||2, or \{b2,bf) + S2%^| > e||b3|p. 

- 2a- {xi,X2,x^) = (51,2,53), |5i| = l for i& { 1 , 3 } and ||6i|| < (1 - e)||b3||. 

- 2h- {xi,X2,xz) = (51,2,53), | 5 i| = I for i & { 1 , 3 }, || 6 i|| > (1 - £)||b3|| 
and at least one of the following conditions holds: |(6i,b3)| > £||b3|p, or 
\{bl,b2) + Si^i^l > £||b3||2, or 1 (^ 3 , ^2) + S3%iJ-| > £||b3|p. 

- 3-{xi,X2,x:i) = (2,52,53), |5i| = l for i& { 2 , 3 }, ||6i|| > (1 - e)||b3|| and at 
least one of the following conditions holds: 1(62,63)! > £||63|p, or 1(62,61) + 
g^il^ilLi > £||63||2, or 1(63,61) + 53^^%^^-! > £||63|p. 
Abstract. We study the problem of computing the $k$-th term of the
Farey sequence of order $n$, for given $n$ and $k$. Several methods for generating
the entire Farey sequence are known. However, these algorithms
require at least quadratic time, since the Farey sequence has $\Theta(n^2)$ elements.
For the problem of finding the $k$-th element, we obtain an algorithm
that runs in time $O(n \lg n)$ and uses space $O(\sqrt{n})$. The same
bounds hold for the problem of determining the rank in the Farey sequence
of a given fraction. A more complicated solution can reduce the
space to $O(n^{1/3} (\lg \lg n)^{2/3})$ and, for the problem of determining the rank
of a fraction, reduce the time to $O(n)$. We also argue that an algorithm
with running time $O(\mathrm{poly}(\lg n))$ is unlikely to exist, since that would
give a polynomial-time algorithm for integer factorization.



1 Introduction 

For any positive integer $n$, the Farey sequence of order $n$ is the set of all irreducible
fractions $p/q$ with $0 < p < q \le n$, arranged in increasing order. An
alternative definition could include $0/1$ and $1/1$ as special fractions. For example,
the Farey sequence for $n = 5$ is: $1/5, 1/4, 1/3, 2/5, 1/2, 3/5, 2/3, 3/4, 4/5$. The number of elements
of the Farey sequence is asymptotically $3n^2/\pi^2 + O(n \lg n)$ [5].

The Farey sequence is a well-known concept in number theory, whose exploration
has led to a number of interesting results. However, from an algorithmic
point of view, very little is known. In particular, the only problem that appears 
to be investigated is that of generating the entire sequence for a given n. Several 
interesting solutions exist for this problem, though none of them presents any 
algorithmic challenge: 

— Sort all unreduced fractions and remove duplicates. The running time
is $O(n^2 \lg n)$, which is almost optimal, but the space is $O(n^2)$. (We assume
that we are only interested in generating the fractions, not storing them;
otherwise, quadratic space is clearly the best possible.)

— The space in the above algorithm can be reduced to $O(n)$, without changing
the running time [7]. This uses a priority queue to merge $n$ sequences, where
the $i$-th such sequence is $1/i, 2/i, \ldots, (i-1)/i$.

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 358-366, 2004. 
— To obtain the sequence of order $n + 1$ from the sequence of order $n$, consider
all consecutive fractions $p/q$ and $p'/q'$ from the sequence of order $n$, and insert
the mediant fraction $(p + p')/(q + q')$ between them, if the denominator is $n + 1$ [5]. This
surprising construction is based on the initial observation made by Farey
in 1816 [4]. The resulting algorithm is the worst so far: the running time is
$O(n^3)$, and the space is $O(n^2)$.

— Combining several properties satisfied by the Farey sequence, one can get a trivial
iterative algorithm, which generates the next Farey fraction, based on the
previous two ([5], problem 4-61). If $p/q$ and $p'/q'$ are the last two fractions, the
next one, $p''/q''$, is given by:

$p'' = \left\lfloor \frac{q + n}{q'} \right\rfloor p' - p, \qquad q'' = \left\lfloor \frac{q + n}{q'} \right\rfloor q' - q.$

This is an ideal algorithm: it uses $O(n^2)$ time, and $O(1)$ space.

— The Stern-Brocot tree is obtained by starting with $0/1$ and $1/0$, and repeatedly
inserting the mediant between any two fractions that are consecutive in the
in-order traversal of the tree [5]. Farey fractions form a subtree of the Stern-Brocot
tree, often called the Farey tree. One can generate the Farey fractions
in order, by recursively exploring the tree. The algorithm requires quadratic
time, and $O(n)$ memory (corresponding to the maximum depth of the Farey
fractions of order $n$).
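
The next-term rule from the list above can be sketched in a few lines of Python. This is only an illustration under the definition used here ($0 < p < q \le n$); the function name `farey` and the use of $0/1$ and $1/1$ as sentinels that are never reported are assumptions of the sketch, not part of the original text.

```python
def farey(n):
    """Yield the Farey fractions of order n as (p, q) pairs with 0 < p < q <= n."""
    p, q = 0, 1        # sentinel 0/1 (not reported)
    p1, q1 = 1, n      # smallest fraction of the sequence, 1/n
    while q1 != 1:     # the sentinel 1/1 marks the end
        yield p1, q1
        t = (q + n) // q1                      # floor((q + n) / q')
        p, q, p1, q1 = p1, q1, t * p1 - p, t * q1 - q


# Example use: the nine fractions of order 5, in increasing order.
print(list(farey(5)))
```

Only the last two fractions are kept at any time, which is the constant-space behaviour described for this method.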



The model of computation assumed by these algorithms, as well as the re- 
maining algorithms from this paper, is the standard word RAM (Random Access 
Machine). Such a machine can access words of O(lgn) bits, and can perform 
usual arithmetic operations on such words in unit time. Space is also counted in 
words. 

In this work, we consider the most natural question in addition to that of 
generating the entire Farey sequence. For given n and k, our problem is to 
generate just the k-th element of the Farey sequence of order n (often called the 
k-th order statistic [2]). Our motivation is not based on any practical application
of this problem (we are aware of none), but rather on the algorithmic challenges 
it presents. It seems impossible to obtain good, or even just subquadratic time 
bounds for these problems by modifying the algorithms listed above (the obvious 
choice, and the one to which we devoted the most attention was the solution 
involving the Stern-Brocot tree). Instead, our algorithms will be based on a set 
of rather different ideas. 

Our solution involves algorithms for another natural problem: given a fraction,
determine its rank in the Farey sequence. The bounds we obtain are usually
identical for both problems. In section 2, we describe a reduction between these
problems, and design an initial algorithm that runs in time $O(n \lg^2 n)$ and uses
space $O(n)$. In section 3, we improve the time complexity of this algorithm to
$O(n \lg n)$. Finally, section 4 improves the space complexity to $O(\sqrt{n})$, while
preserving the running time.

We have implemented the final algorithm, as well as the methods described
above for generating the entire sequence, and found ours to be very efficient, 
both in terms of time, and space. Experimental results or source code can be 
obtained from the authors. 

This result leaves two natural questions unanswered. The first one is how
little memory suffices for an algorithm with roughly linear, say $O(n \cdot \mathrm{poly}(\lg n))$,
running time. In particular, reducing the memory to $\mathrm{poly}(\lg n)$ would be interesting.
In section 5, we present a more complex algorithm, based on a result of
Deléglise and Rivat [3], which uses space $O(n^{1/3} (\lg \lg n)^{2/3})$. The algorithm can
determine the rank of a fraction in $O(n)$ time; for our original problem (finding
a fraction of a given rank) the $O(n \lg n)$ time bound is not improved.

The second question is concerned with time, rather than space. Note that the 
input to the problem consists of just two words, or O(lgn) bits, namely n and 
k. Therefore, there is nothing that prohibits the existence of an algorithm with 
running time sublinear in n. For somewhat related problems, such as computing 
the number of primes less than a certain value, sublinear time algorithms are 
known [1]. It seems reasonable to hope that the running time of the algorithm 
from section 5 can be improved to $O(n^{1-\varepsilon})$ for some constant $\varepsilon > 0$. We describe
two subproblems which would be sufficient to obtain such a result, but which
we cannot solve. In section 6 we argue instead that a much faster algorithm,
with running time $O(\mathrm{poly}(\lg n))$, is unlikely to exist. More precisely, we show
that such a polynomial algorithm for our problem would immediately imply a
polynomial-time algorithm for integer factorization.

2 An Initial Algorithm 

We now describe a first attempt to solve the problem, which will give an algorithm
running in $O(n \lg^2 n)$ time and $O(n)$ space. This algorithm forms the
basis of our improved algorithms from the following sections. Our solution uses
as a subroutine an algorithm for determining the rank of a given fraction (not
necessarily in reduced form) in the Farey sequence. The subroutine developed in
this section will run in $O(n \lg n)$ time and $O(n)$ space.

We begin with a reduction from the order statistic problem to the fraction
rank problem. Assume we want to compute the $k$-th order statistic. We first
determine a number $j$ such that the answer lies in the interval $[j/n, (j+1)/n)$. This
can be done by binary search, as follows. Assume we have a guess for the value
of $j$. Determine the rank of $j/n$ in the Farey sequence. If this rank is at most $k$,
we know the correct value of $j$ is not smaller than the current guess. Otherwise,
we should continue searching below $j$. This stage of the algorithm uses $O(\lg n)$
calls to the fraction rank subroutine.

To solve the problem, we must now determine the fraction with rank equal
to $k - \mathrm{rank}(j/n)$ among all irreducible fractions from the interval $[j/n, (j+1)/n)$. Notice
that there is at most one fraction in this interval for any denominator less than
$n$. This follows from the fact that the length of the interval is $1/n$, and consecutive
fractions with denominators $q < n$ are separated by $1/q > 1/n$. In addition, for
a given $q$, this fraction can be found in constant time, since the numerator
must be $\lfloor ((j+1)q - 1)/n \rfloor$. There might not be any fraction in the range for a certain
denominator, so we also need to check that the fraction we obtain in this way
lies in the feasible interval.

Given these properties, it is not hard to find efficient algorithms for this 
subproblem. A particularly easy one would consist of generating all (up to n) 
fractions from the feasible interval, sorting them, and eliminating duplicates. 
Alternatively, one can make a list of just the irreducible fractions, and then use 
a linear time order statistic algorithm [2] on this list. However, we describe a 
more interesting solution. This solution has the advantage that it runs in $O(n)$
time, and uses just $O(1)$ memory, which will prove important in the following
section. First, generate the fractions in the range by considering all possible
denominators. As fractions are generated, keep just the minimum fraction found
that is strictly greater than $j/n$. At the end, reduce both $j/n$ and the minimum
fraction greater than it. We now have two consecutive fractions from the Farey
sequence. As mentioned in the introduction, there is a simple constant-time 
algorithm that can generate the next fraction in the Farey sequence based on 
the previous two. This means that we can iterate through the fractions in the 
range in increasing order, remembering just the previous two fractions. All we 
have to do is count up to the desired rank, and return the corresponding fraction. 
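
The reduction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `kth_farey` and `brute_rank` are hypothetical, the rank subroutine is passed in as a parameter, and the brute-force `brute_rank` is only a stand-in oracle so that the sketch runs on its own. As a small variation, the scan keeps the smallest numerator above $j/n$ for each denominator rather than the largest numerator below $(j+1)/n$; both pick out the same minimum fraction.

```python
from fractions import Fraction
from math import gcd


def brute_rank(x, n):
    """Stand-in rank oracle: number of irreducible p/q <= x with 0 < p < q <= n."""
    return sum(1 for q in range(2, n + 1) for p in range(1, q)
               if gcd(p, q) == 1 and Fraction(p, q) <= x)


def kth_farey(n, k, rank=brute_rank):
    """Return the k-th Farey fraction of order n (1-based); k is assumed to be in range."""
    # Stage 1: binary search for the largest j with rank(j/n) <= k.
    lo, hi = 0, n - 1
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if rank(Fraction(mid, n), n) <= k:
            lo = mid
        else:
            hi = mid - 1
    j = lo
    steps = k - rank(Fraction(j, n), n)
    prev = Fraction(j, n)            # Fraction reduces j/n; it is 0/1 when j == 0
    if steps == 0:
        return prev
    # Stage 2: minimum fraction strictly greater than j/n, one candidate per denominator.
    cur = min(Fraction(j * q // n + 1, q) for q in range(1, n + 1))
    # Stage 3: walk forward with the next-term rule, keeping only two fractions.
    for _ in range(steps - 1):
        t = (prev.denominator + n) // cur.denominator
        prev, cur = cur, Fraction(t * cur.numerator - prev.numerator,
                                  t * cur.denominator - prev.denominator)
    return cur


print([str(kth_farey(5, k)) for k in range(1, 10)])
```

Any faster rank routine with the same signature can be substituted for `brute_rank` without changing the rest of the sketch.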



We are now left with giving an algorithm for the fraction rank problem.
We will actually solve a slightly more general problem: for any real number
$x$, determine the number of irreducible fractions $p/q \le x$, with $q \le n$. Let $A_q$
be the number of such irreducible fractions with denominator equal to $q$. Any
fraction with denominator $q$ is either irreducible, in which case it should be
counted in $A_q$, or has a unique reduced representation. The denominator of
the reduced representation is a divisor of $q$. This transformation is, in fact,
reversible: given any irreducible fraction $p/d$ where $d$ is a divisor of $q$, we can
multiply both the numerator and the denominator by $q/d$ to get a fraction with
denominator equal to $q$. So we have a bijection between the set of all fractions
in $(0, x]$ with denominator $q$ and the set of reduced fractions with denominator
$d$, for all divisors $d$ of $q$. This gives a recursive formula for $A_q$, which leads to
the solution of our problem:

$\mathrm{rank}(x) = \sum_{q=1}^{n} A_q, \qquad A_q = \lfloor x \cdot q \rfloor - \sum_{d < q, \ d \mid q} A_d \qquad (1)$

The way to translate this formula into an efficient algorithm is obviously not
the most direct one, since there is no fast way to iterate over all divisors of a
number. Instead, begin by initializing an array $A[1..n]$ by $A[q] = \lfloor x \cdot q \rfloor$. Then
consider all numbers $q$ in increasing order from 1 to $n$. For each $q$, consider all
multiples $mq$, and subtract $A[q]$ from $A[mq]$. At step number $q$, we always have
$A[q] = A_q$, and this algorithm is a simple reformulation of the recursive formula
from above. The rank can be computed by summing the final values of all $A[q]$.
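
A minimal Python sketch of this array-based procedure is given below. The function name `farey_rank` is an assumption of the sketch, and $x$ is taken to be an exact `Fraction` so that the floors are computed without rounding error.

```python
from fractions import Fraction
from math import floor


def farey_rank(x, n):
    """Number of irreducible fractions p/q with 0 < p/q <= x and q <= n (formula 1)."""
    A = [0] * (n + 1)
    for q in range(1, n + 1):
        A[q] = floor(x * q)              # all fractions p/q <= x, reduced or not
    for q in range(1, n + 1):
        # A[q] is final here; remove its contribution from every proper multiple.
        for m in range(2 * q, n + 1, q):
            A[m] -= A[q]
    return sum(A)


print(farey_rank(Fraction(1, 2), 5))     # fractions 1/5, 1/4, 1/3, 2/5, 1/2
```

The inner loop visits about $n/q$ multiples for each $q$, which is where the harmonic sum in the running time comes from.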

The running time is $O\left(n + \sum_{q=1}^{n} \frac{n}{q}\right) = O(n \lg n)$. So we have
an $O(n \lg n)$ algorithm for the rank problem, and thus an $O(n \lg^2 n)$ algorithm
for the order statistic problem. Both algorithms use $O(n)$ space.

3 Improving the Running Time 

We now improve the time complexity of the algorithm from the previous section. 
Since the running time of our order statistic algorithm is dominated by the 
running time of the fraction rank algorithm, we concentrate on the latter. As 
noted before, the order statistic algorithm makes O(lgn) calls to the fraction 
rank subroutine. The key idea suggested by this fact is that one should try to 
introduce some amount of preprocessing, which gives a one-time cost, in order 
to improve the running time of every call to the fraction rank subroutine. 

To find a way to trade preprocessing for running time of the rank subroutine,
let us re-examine relation 1. After recursive expansions of all $A_q$, the resulting
$\mathrm{rank}(x)$ will be a linear combination of terms of the form $\lfloor x \cdot q \rfloor$, for all $q \le n$.
Except for these terms, the recursive formula is independent of $x$. Therefore, we
can precalculate the coefficient of every $\lfloor x \cdot q \rfloor$. Now the rank routine becomes
trivial: for all $q$, calculate $\lfloor x \cdot q \rfloor$, and add this value to the rank, weighted by
the appropriate coefficient. The numbers appearing at intermediate steps in the
computation, and even the coefficients themselves, may become large. However,
the end result (the rank) is obviously bounded by $n^2$, so if all computations are
performed modulo a number greater than that, the result will be correct (this is
important because we can only manipulate in constant time numbers that have
$O(\lg n)$ bits). This issue is transparent to the implementation, since normally all
computations are carried out modulo a power of two, such as $2^{32}$ for the usual
32-bit machines.

The algorithm for precalculating the coefficients of $\lfloor x \cdot q \rfloor$ is symmetric to
our old algorithm for calculating the rank. It is based on the following recursion
defining the coefficients:

$C_q = 1 - \sum_{t > q, \ q \mid t} C_t$, for all $q \le n \qquad (2)$

The correctness of the formula follows from “reverse induction”, since the
calculation of $C_q$ uses only coefficients $C_t$ with $t > q$. Indeed, the term $\lfloor x \cdot q \rfloor$
appears initially in $A_q$. Then, $A_q$ is subtracted from all its multiples $t$. At that
point $A_t$ contains $\lfloor x \cdot t \rfloor$ with coefficient 1, and $\lfloor x \cdot q \rfloor$ with coefficient $-1$. All
subsequent operations involving $A_t$ contribute to the total coefficient of $\lfloor x \cdot q \rfloor$
exactly by minus the coefficient of $\lfloor x \cdot t \rfloor$: since $A_t$ is the only one that contains
$\lfloor x \cdot t \rfloor$ initially, all operations involving $A_t$ are described by the final coefficient
of $\lfloor x \cdot t \rfloor$.

The algorithm follows immediately from the formula, and calculates the coefficients
from $C_n$ down to $C_1$. The running time is $O\left(\sum_{q=1}^{n} \frac{n}{q}\right) = O(n \lg n)$.

This cost is paid once, and every call to the rank subroutine can be answered
in $O(n)$ time, so the total running time of the order statistic algorithm is also
$O(n \lg n)$.
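
The preprocessing of this section can be sketched as below. The function names are hypothetical, and the fraction $x$ is passed as an integer numerator and denominator so that every floor is exact. Python integers are unbounded, so the remark about computing modulo a power of two does not arise in this sketch.

```python
def coefficients(n):
    """C[q] for 1 <= q <= n, computed from C_n down to C_1 using formula 2."""
    C = [0] * (n + 1)
    for q in range(n, 0, -1):
        C[q] = 1 - sum(C[t] for t in range(2 * q, n + 1, q))
    return C


def rank_from_coefficients(num, den, C):
    """Rank of num/den as the weighted sum of floor(x * q) over all q <= n."""
    n = len(C) - 1
    return sum(C[q] * (num * q // den) for q in range(1, n + 1))


C = coefficients(5)
print(C[1:])                              # coefficients C_1 .. C_5
print(rank_from_coefficients(1, 2, C))    # rank of 1/2 in the sequence of order 5
```

Once `coefficients(n)` has been computed, each rank query is a single pass over $q = 1, \ldots, n$.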

4 Improving the Space Complexity

The key observation for improving the space complexity is that coefficients $C_q$
and $C_{q'}$ are identical whenever $\lfloor n/q \rfloor = \lfloor n/q' \rfloor$. This is not hard to see, and also
offers a good intuition for the new algorithm. Consider some value $q$. The term
$\lfloor x \cdot q \rfloor$ is added initially to $A_q$. Then $A_q$, which includes $\lfloor x \cdot q \rfloor$, is subtracted
from $A_{m_1 q}$, for all possible $m_1$. The values $A_{m_1 q}$, which now contain $\lfloor x \cdot q \rfloor$ with
a coefficient of $-1$, are then subtracted from $A_{m_2 m_1 q}$, and so on. The branches
of this recursion are only trimmed off when $q \prod_i m_i > n$, which is equivalent to
$\prod_i m_i > \lfloor n/q \rfloor$. So the coefficient $C_q$ only depends on $\lfloor n/q \rfloor$.

Based on this fact, we observe that there are only $O(\sqrt{n})$ distinct values for the
coefficients among all $C_q$ with $q > \sqrt{n}$. To avoid such repetitions, we break the
coefficients into two groups. The coefficients $C_1, \ldots, C_{\lfloor \sqrt{n} \rfloor}$ are stored as before.
Instead of storing $C_q$ for $q > \sqrt{n}$, we store an array $D_r$, such that $C_q = D_{\lfloor n/q \rfloor}$
for any $q > \sqrt{n}$. Notice that both arrays have $O(\sqrt{n})$ elements. The fraction
rank algorithm remains trivial. For all $q \le \sqrt{n}$, we calculate $\lfloor x \cdot q \rfloor$ and add it
to the rank, weighted by $C_q$. For the remaining $q$'s, we instead use the weight
$D_{\lfloor n/q \rfloor}$.

It remains to show how to precompute the sequences $D_r$ and $C_q$ using just
$O(\sqrt{n})$ space. The computation of $D_r$ is based on the following recursive formula:

$D_r = 1 - \sum_{t=2}^{r} D_{\lfloor r/t \rfloor} \qquad (3)$

The formula can be obtained by careful relabeling of formula 2. Take a $q$
such that $r = \lfloor n/q \rfloor$. Now consider our previous recursive formula for
$C_q$ (slightly rewritten here):

$C_q = 1 - \sum_{t=2}^{\lfloor n/q \rfloor} C_{tq}$

By definition, we have $D_r = C_q$ and $\lfloor n/q \rfloor = r$. Also, $C_{tq} = D_{\lfloor n/(tq) \rfloor}$. The index
on the right-hand side can be rewritten as $\lfloor n/(tq) \rfloor = \lfloor \lfloor n/q \rfloor / t \rfloor = \lfloor r/t \rfloor$, finalizing
the transformation into formula 3.

Once we have the values $D_r$, computing the array $C_q$ can be done as before,
using relation 2. The only difference is that whenever the algorithm needs $C_t$ for
$t > \sqrt{n}$, it should instead use $D_{\lfloor n/t \rfloor}$, since we are only computing the values
$C_t$ for $t \le \sqrt{n}$. The time required by the computation of the sequence $D_r$ is
quadratic in the size of the table, that is, $O(n)$. Computing $C_q$ takes $O(n \lg n)$
time, as before. Finally, rank queries still require linear time, so the overall
running time is unchanged, and the space is reduced to $O(\sqrt{n})$.
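
The two-table scheme of this section can be sketched as follows. The function names are hypothetical and the fraction $x$ is again passed as an integer numerator and denominator. The sketch takes the split point to be $\lfloor \sqrt{n} \rfloor$ and sizes the $D$ table by the largest value of $\lfloor n/q \rfloor$ that occurs for $q$ above the split; both choices are assumptions consistent with the description above.

```python
from math import isqrt


def compact_coefficients(n):
    """Return (C, D): C[q] for q <= floor(sqrt(n)), and D[r] for the values r = n // q
    that occur when q > floor(sqrt(n)). Both lists have O(sqrt(n)) entries."""
    s = isqrt(n)
    r_max = n // (s + 1)                  # largest n // q over q > s; at most s
    D = [0] * (r_max + 1)
    for r in range(1, r_max + 1):         # formula 3, quadratic in the table size
        D[r] = 1 - sum(D[r // t] for t in range(2, r + 1))
    C = [0] * (s + 1)
    for q in range(s, 0, -1):             # formula 2, reading D when t > sqrt(n)
        total = 0
        for t in range(2 * q, n + 1, q):
            total += C[t] if t <= s else D[n // t]
        C[q] = 1 - total
    return C, D


def rank_compact(num, den, n, C, D):
    """Rank of num/den using only the two small tables."""
    s = len(C) - 1
    total = 0
    for q in range(1, n + 1):
        weight = C[q] if q <= s else D[n // q]
        total += weight * (num * q // den)
    return total


C, D = compact_coefficients(10)
print(rank_compact(1, 2, 10, C, D))       # rank of 1/2 in the sequence of order 10
```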



5 A Better Way to Calculate Coefficients 

This section describes a better way to calculate the coefficients $D_r$ from the
previous section, resulting in improved, but more complicated, algorithms. It
can be seen that $D_r$ is precisely equal to $M(r)$, where $M$ is the summatory
function of the Möbius function, $M(r) = \sum_{t=1}^{r} \mu(t)$. The Möbius function is
defined by:

— $\mu(1) = 1$;

— $\mu(t) = 0$ if $t$ has a squared prime factor;

— $\mu(t) = (-1)^k$ if $t = p_1 \cdot \ldots \cdot p_k$, where all $p_i$'s are distinct and prime.



The identity between our coefficients and $M(r)$ is immediate, since $M(r)$
satisfies the recursive formula 3 defining our coefficients ([5], relation 4.61); this
is one of the fundamental properties of the Möbius function. Given this identity,
we can use an algorithm by Deléglise and Rivat [3], which calculates $M(r)$ using
$O(r^{2/3} (\lg \lg r)^{1/3})$ time and $O(r^{1/3} (\lg \lg r)^{2/3})$ space.
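
The identity between formula 3 and $M(r)$ can be checked numerically for small $r$. The sketch below uses a plain sieve for $\mu$ and prefix sums for $M$; it is not the algorithm of [3], and the function names are hypothetical.

```python
def mobius_upto(n):
    """mu[t] for 0 <= t <= n, by a sieve over the primes (mu[0] is set to 0)."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for m in range(p, n + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]            # one more distinct prime factor
            for m in range(p * p, n + 1, p * p):
                mu[m] = 0                 # squared prime factor
    return mu


def mertens_upto(n):
    """M[r] = mu(1) + ... + mu(r) for 0 <= r <= n."""
    mu = mobius_upto(n)
    M = [0] * (n + 1)
    for r in range(1, n + 1):
        M[r] = M[r - 1] + mu[r]
    return M


M = mertens_upto(50)
# Formula 3 with D replaced by M, checked for every r up to 50.
print(all(M[r] == 1 - sum(M[r // t] for t in range(2, r + 1)) for r in range(1, 51)))
```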

Remember that our algorithm needs the coefficients $C_q$, $q \in \{1, \ldots, n\}$, and
that $C_q = M(\lfloor n/q \rfloor)$. For all $q \le n^{1/6}$, we will use the algorithm of [3] to calculate
$M(\lfloor n/q \rfloor)$. This first stage will require $n^{1/6} \cdot n^{2/3 + o(1)} = n^{5/6 + o(1)}$ time. Calculating
$C_q$ for all $q > n^{1/6}$ can be done by calculating $D_r$ for all $r < n^{5/6}$. Since
$D_r = \sum_{t=1}^{r} \mu(t)$, calculating the $D_r$'s in increasing order of $r$ reduces to
calculating $\mu(t)$ for all $t < n^{5/6}$. In turn, calculating $\mu(t)$ is trivial if we have the
prime factorization of $t$. There exist several factorization algorithms which can
factor $t$ in time $t^{o(1)}$ [6], so we obtain a running time of $n^{5/6 + o(1)}$ for calculating
all coefficients. The dominant factor in space usage is the space used by a call
to the subroutine for calculating $M(r)$, which is $O(n^{1/3} (\lg \lg n)^{2/3})$.

The dominant factor in the running time is no longer calculating the coefficients.
As each $C_q$ becomes available, we need to multiply it by $\lfloor x \cdot q \rfloor$ and add
the result to an accumulator. Thus, the running time for computing the rank of
a fraction is $O(n)$. The time needed to compute the fraction of a given rank is
still $O(n \lg n)$, because we run a binary search on top of the rank computation,
as explained in section 2.

Since the algorithm of this section can compute all coefficients in sublinear
time, there is hope of obtaining a sublinear running time overall. We now
describe a possible approach; however, since we cannot solve the two necessary
subproblems, this remains speculation. First, to compute ranks in sublinear time,
we could observe that many consecutive terms $\lfloor x \cdot q \rfloor$ have the same coefficient
$C_q$ when $q > \sqrt{n}$. Thus, we should look for an algorithm to evaluate $\sum_{q=a}^{b} \lfloor x \cdot q \rfloor$
in time $O((b - a)^{1 - \varepsilon})$ for some constant $\varepsilon > 0$. Even this, however, would not
imply a sublinear time algorithm for finding the fraction of a given rank. This
is because the binary search from section 2 can only narrow down the range
of possible fractions, and does not actually produce an output fraction in the
form $p/q$. As described, the range is reduced to $[j/n, (j+1)/n)$; we do not know how
to find a fraction of a given rank from this range without enumerating all $O(n)$
fractions. Alternatively, we could reduce the range to an interval of size $O(1/n^2)$,
and then we would need to find the unique fraction left in this range.

6 Relation to Factorization

We now show that a polynomial-time algorithm (i.e. one running in $O(\mathrm{poly}(\lg n))$
time) for our problem is unlikely to exist, since that would immediately give a
polynomial algorithm for integer factorization. It is a well-known conjecture that
such an algorithm does not exist.

Given a polynomial time algorithm for finding an order statistic, we can 
construct a polynomial time algorithm for finding the rank of a given fraction 
(this is the reverse of the reduction used for our actual algorithms) . Observe that 
we can test whether a guessed rank is too high or too low, using only one oracle 
query to the order statistic algorithm: simply determine the fraction with that 
rank, and compare it with the input fraction. Therefore, we can use galloping 
binary search to solve the problem. We begin by trying powers of two until we 
either obtain a fraction greater than the input fraction, or we exceed the number 
of fractions in the Farey sequence (as reported by the order statistic algorithm). 
Then, we do a binary search for the correct rank in the interval between the last 
powers of two tried. This algorithm makes O(lgn) calls to the order statistic 
algorithm, and no other expensive computations, so it runs in polynomial time. 
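
The galloping search just described can be sketched as follows. The order statistic algorithm is treated as a black box: `brute_select` is a hypothetical brute-force stand-in that returns `None` when the requested rank exceeds the length of the sequence, which plays the role of the oracle reporting that the rank is out of range.

```python
from fractions import Fraction
from math import gcd


def brute_select(n, k):
    """Stand-in oracle: the k-th Farey fraction of order n, or None if k is out of range."""
    seq = sorted(Fraction(p, q) for q in range(2, n + 1)
                 for p in range(1, q) if gcd(p, q) == 1)
    return seq[k - 1] if 1 <= k <= len(seq) else None


def rank_by_galloping(x, n, select=brute_select):
    """Number of Farey fractions of order n that are <= x, using only select() queries."""
    def too_high(k):
        f = select(n, k)
        return f is None or f > x

    hi = 1
    while not too_high(hi):       # try powers of two until the guess overshoots
        hi *= 2
    lo = hi // 2                  # lo == 0, or the lo-th fraction is <= x
    while hi - lo > 1:            # binary search between the last two powers tried
        mid = (lo + hi) // 2
        if too_high(mid):
            hi = mid
        else:
            lo = mid
    return lo


print(rank_by_galloping(Fraction(1, 2), 5))
```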

Our algorithm for factorization is based on yet another problem: given a 
number n, and k < n that is relatively prime to n, report the number of integers 
in [2, k] that are relatively prime to n. Given a polynomial time algorithm for this 
problem, we can use binary search to find a factor of n. Assume k is relatively 
prime to n (otherwise, we immediately get a factor), and we know the number 
of integers in [2, k] that are relatively prime to n. If this number is k — 1, we 
know that the smallest factor of n is greater than k; otherwise, there is at least 
one factor below k. 
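
The binary search for a factor can be sketched as below. The counting problem is again treated as a black box: `count_coprime` is a hypothetical brute-force stand-in for the assumed polynomial-time oracle, and `find_factor` is only an illustration of the search, not a practical factoring routine.

```python
from math import gcd


def count_coprime(n, k):
    """Stand-in oracle: number of integers in [2, k] that are relatively prime to n."""
    return sum(1 for i in range(2, k + 1) if gcd(i, n) == 1)


def find_factor(n, oracle=count_coprime):
    """Return a factor of n greater than 1 (n itself when n is prime), for n >= 2."""
    lo, hi = 2, n      # the smallest i >= 2 sharing a factor with n lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        g = gcd(mid, n)
        if g > 1:
            return g   # mid is not relatively prime to n: a factor is found at once
        if oracle(n, mid) == mid - 1:
            lo = mid + 1   # everything in [2, mid] is coprime to n
        else:
            hi = mid       # some integer below mid shares a factor with n
    return lo


print(find_factor(91), find_factor(97))
```

The loop makes $O(\lg n)$ oracle calls, matching the count of queries in the argument above.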

It remains to describe the relation between the Farey sequence and this problem.
Observe that all numbers $i \in [2, k]$ that are relatively prime to $n$ give fractions
$i/n$ in the Farey sequence of order $n$. To count these numbers, we begin by
finding the rank of $k/n$ in the Farey sequence of order $n$. Then, we determine the
largest fraction strictly smaller than $k/n$. This can be done by one call to the
order statistic algorithm, since we already know the rank of $k/n$. Since $k/n$ is irreducible
(by assumption), and it is the mediant of neighboring fractions from the
Farey sequence, it follows that the preceding fraction has a denominator strictly
smaller than $n$. We now find the rank of this fraction, in the Farey sequence of
order $n - 1$. Observe that the difference between the rank of $k/n$ and this rank is
equal to the number of irreducible fractions $i/n \le k/n$, which is what we wanted to
count.

The reduction from above uses the order statistic oracle in one place, namely
to find the largest fraction smaller than $k/n$. One may wonder whether the reduction
holds if we assume just a polynomial time algorithm for the problem
of ranking a fraction, proving also the hardness of this problem. The following 
smarter reduction answers this question in the affirmative. Consider the fractions
and Since their difference is there exists exactly one fraction
in this range with a denominator of n — 1. Find this fraction (with O(1) arithmetic
operations), and reduce it, which takes O(lgn) time. Now find the ranks
of this fraction in the Farey sequences of order n and n — 1. The difference in 
ranks is exactly the number of irreducible fractions ^ possibly plus one 

due to ^ . This nonuniformity is easily fixed by testing whether our fraction with 
a denominator of n — 1 is smaller or larger than so we can again count the 
number of integers from [2, k] that are relatively prime to n. 
Abstract. We present an algorithm for the computation of the discrete
logarithm in logarithmic $\ell$-class groups. This is applied to the calculation
of the $\ell$-rank of the wild kernel $WK_2(F)$ of a number field $F$. In
certain cases it can also be used to determine generators of the $\ell$-part of
$WK_2(F)$.

1 Introduction 

A new invariant of number fields, called group of logarithmic classes, was introduced
by J.-F. Jaulent in 1994 [J3]. The arithmetic of logarithmic classes is
interesting because of its applicability to $K$-theory. Indeed for a given prime
number $\ell$, the $\ell$-rank of the logarithmic $\ell$-class group of a number field $F$ containing
the $2\ell$-th roots of unity equals the $\ell$-rank of the wild kernel.

We present a method for the computation of the $\ell$-rank of the wild kernel for
the case where $\ell$ is an odd prime number and $F$ does not contain the $\ell$-th roots
of unity.

For the case where £'^ is the exponent of the logarithmic Aclass group C£p,i 
of F and F contains the £™+^-th roots of unity we give a complete logarithmic 
description of the Apart of the wild kernel. 

First we recall the most important definitions from the theory of logarithmic
$\ell$-class groups and the algorithm for their computation. This is followed by an
algorithm for the computation of discrete logarithms in these groups (section 2).
In section 3 we give a short introduction to the wild kernel and derive the algorithms
for the computation of its $\ell$-rank in a general setting. Section 4 contains
the complete description of the $\ell$-part of the wild kernel through the logarithmic
$\ell$-class group. This is followed by some examples.

In the following, $\ell$ denotes a fixed prime number and $\mathbb{Z}_\ell$ the completion of
$\mathbb{Z}$ with respect to the non-archimedean exponential valuation $v_\ell$. $F$ denotes a
number field.



2 The Logarithmic £-Class Group 

For a detailed presentation of the logarithmic theory see [J3]. A first algorithm
for the computation of the group of logarithmic classes of a number field $F$ was
developed by F. Diaz y Diaz and F. Soriano in 1999 [DS]. We use the algorithm
from [DJ+] as it removes the restriction to Galois extensions of $\mathbb{Q}$. This algorithm
uses the ideal theoretic description of the logarithmic $\ell$-class groups. Before we
discuss it we need some definitions.

Let p be a prime number and let p be a prime ideal of -F over p. For a € 
xFp X (1 + 2pZp) denote by (a) the projection of a to (1 + 2pZp). Let 
Fp be the completion of F with respect to p. For a G F* we define 



gp{a) 



^ogp{Np^/Q^{a)) 



[Fp 



■ deg„p 



where deg^ p 



Log^p for p yf 

£ for p = £ yf 2; 

4 for p = £ = 2. 



The logarithmic ramification index Cp can be described as follows. The p-part of 
the logarithmic ramification index Cp is [gp(F'p ) : Zjp], For all primes q with q ^ p 
the q part of Cp is the q part of the ramification index Cp of p. The logarithmic 
inertia degree fp is defined by the relation Spfp = Cp/p = deg(F/Q), where fp is 
the classic inertia degree. We use it for the definition of the logarithmic degree 
of a place p: 



degfP := /pdeg^p. 



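Furthermore we set

$$\tilde v_\mathfrak{p}(x) := -\frac{\operatorname{Log}_\ell(N_{F_\mathfrak{p}/\mathbb{Q}_p}(x))}{\deg_F \mathfrak{p}}
\qquad \text{for } x \in \mathcal{R}_F := \mathbb{Z}_\ell \otimes_\mathbb{Z} F^\times.$$

We define the group of $\ell$-ideals

$$Id_{F,\ell} := \Big\{ \mathfrak{a} = \prod_{\mathfrak{p} \nmid \ell} \mathfrak{p}^{\alpha_\mathfrak{p}} \;\Big|\; \alpha_\mathfrak{p} = 0 \text{ for almost all } \mathfrak{p} \Big\},$$

and for $\mathfrak{a} = \prod_{\mathfrak{p} \nmid \ell} \mathfrak{p}^{\alpha_\mathfrak{p}} \in Id_{F,\ell}$ set $\deg_F \mathfrak{a} := \sum_{\mathfrak{p} \nmid \ell} \alpha_\mathfrak{p} \tilde f_\mathfrak{p} \deg_\ell p$. We denote by

$$\widetilde{Id}_{F,\ell} := \{ \mathfrak{a} \in Id_{F,\ell} \mid \deg_F \mathfrak{a} = 0 \}$$

the subgroup of $\ell$-ideals of degree 0 and by

$$\widetilde{Pr}_{F,\ell} := \Big\{ \prod_{\mathfrak{p} \nmid \ell} \mathfrak{p}^{\tilde v_\mathfrak{p}(a)} \;\Big|\; a \in \mathcal{R}_F \text{ and } \tilde v_\mathfrak{p}(a) = 0 \ \forall\, \mathfrak{p} \mid \ell \Big\}$$

the subgroup of principal $\ell$-ideals having logarithmic valuations 0 at all $\ell$-adic
places. The group of logarithmic $\ell$-classes is isomorphic to the quotient of the
latter two:

$$\widetilde{C\ell}_{F,\ell} \cong \widetilde{Id}_{F,\ell} / \widetilde{Pr}_{F,\ell}.$$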



The generalized Gross conjecture (for the field $F$ and the prime $\ell$) asserts
that the logarithmic class group $\widetilde{C\ell}_{F,\ell}$ is finite (cf. [J3]). This conjecture, which is
a consequence of the p-adic Schanuel conjecture, has only been proved in the abelian
case and a few others (cf. [FG,J4]). Nevertheless, since $\widetilde{C\ell}_{F,\ell}$ is a $\mathbb{Z}_\ell$-module
of finite type (by $\ell$-adic class field theory), Gross' conjecture just claims
the existence of an integer $m$ such that $\ell^m$ kills the logarithmic class group. In
we present a method for the computation of an upper bound for m. That 
algorithm does not terminate in general if Gross' conjecture is false. This upper
bound can be used as the $\ell$-adic precision in the computation of the logarithmic
class group.



2.1 Generators and Relations of $\widetilde{C\ell}_{F,\ell}$

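Let $\mathfrak{a}_1, \dots, \mathfrak{a}_t$ be a basis of the ideal class group $C\ell_F$ of $F$ with $\gcd(\mathfrak{a}_i, \ell) = 1$ for all
$1 \le i \le t$. Denote by $\mathfrak{p}_1, \dots, \mathfrak{p}_s$ the $\ell$-adic places of $F$. Let $\alpha_1, \dots, \alpha_s$ be elements
of $\mathcal{R}_F = \mathbb{Z}_\ell \otimes_\mathbb{Z} F^\times$ with $\tilde v_{\mathfrak{p}_i}(\alpha_j) = \delta_{ij}$ $(i, j = 1, \dots, s)$ and $\gcd((\alpha_i), \ell) = 1$ for all
$1 \le i \le s$. Set $\mathfrak{a}_{t+i} := (\alpha_i)$ for $1 \le i \le s$. For an ideal $\mathfrak{a}$ of $F$ denote by $\tilde{\mathfrak{a}}$ the
projection of $\mathfrak{a}$ from $\bigoplus_{\mathfrak{p}} \mathfrak{p}^{\mathbb{Z}_\ell}$ to $\bigoplus_{\mathfrak{p} \nmid \ell} \mathfrak{p}^{\mathbb{Z}_\ell}$. We distinguish two cases:

I. If $\deg_\ell(\mathfrak{a}_i) = 0$ for all $1 \le i \le t + s$ then set $\mathfrak{b}_i := \tilde{\mathfrak{a}}_i$. The group $\widetilde{C\ell}_{F,\ell}$ is
generated by $\mathfrak{b}_1, \dots, \mathfrak{b}_{t+s}$.

II. Otherwise let $1 \le j \le t+s$ such that $v_\ell(\deg_\ell(\mathfrak{a}_j)) = \min_{1 \le i \le t+s} v_\ell(\deg_\ell(\mathfrak{a}_i))$.
If we have $\mathfrak{a} = \tilde{\mathfrak{a}} \equiv \prod_{i=1}^{t+s} \tilde{\mathfrak{a}}_i^{a_i} \bmod \widetilde{Pr}_{F,\ell}$ for an ideal $\mathfrak{a} \in \widetilde{Id}_{F,\ell}$ then
$0 = \deg(\mathfrak{a}) = \sum_{i=1}^{t+s} a_i \deg_\ell(\mathfrak{a}_i)$, thus $-a_j = \sum_{i \neq j} a_i \deg_\ell(\mathfrak{a}_i)/\deg_\ell(\mathfrak{a}_j)$. Set $\mathfrak{b}_i := \tilde{\mathfrak{a}}_i \tilde{\mathfrak{a}}_j^{-d_i}$
with $d_i \equiv \deg_\ell(\mathfrak{a}_i)/\deg_\ell(\mathfrak{a}_j) \bmod \ell^m$ where $\ell^m > \exp(\widetilde{C\ell}_{F,\ell})$. The group $\widetilde{C\ell}_{F,\ell}$
is generated by $\mathfrak{b}_1, \dots, \mathfrak{b}_{j-1}, \mathfrak{b}_{j+1}, \dots, \mathfrak{b}_{t+s}$.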

Obviously the ideals hi , . . . , fit are representatives of generators of the group 
C£' := C£f/ {pi, . . . , pg). Let (az^) . be the corresponding relation matrix. The 

relations between the generators hi , . . . , ht of C£' are of the form = («) 

with a G Pf- There exist integers ci, . . . ,Cg such that (a) = OXi mod 
Pr. This yields the relation = n: =1 mod Pr. We can derive all 

relations involving the generators hz + Pr from their relations as generators of 
the group CG in this way. 

The other relations between the generators of C£ are obtained as follows: A 
relation between the generators ZIz is of the form = (1) mod Pr or 

equivalently uuM^r' ■uUif>r = (a) for some a GPf- The last equality is 
fulfilled if and only if IlXi Pf’ is principal, i.e., if IlXi pr* is an (^)-unit. Let 
7 i, . . . , 7 r be a basis of the (£)-units of Pp- set Vzj := Vp. (ji) (1 < z < r, 2 < 
j < s) . We obtain the relation matrix 



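$$M := \begin{pmatrix}
a_{1,1} & \cdots & a_{1,t} & -c_{1,2} & \cdots & -c_{1,s} \\
\vdots & & \vdots & \vdots & & \vdots \\
a_{t,1} & \cdots & a_{t,t} & -c_{t,2} & \cdots & -c_{t,s} \\
0 & \cdots & 0 & v_{1,2} & \cdots & v_{1,s} \\
\vdots & & \vdots & \vdots & & \vdots \\
0 & \cdots & 0 & v_{r,2} & \cdots & v_{r,s}
\end{pmatrix}$$

For the two cases we obtain: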

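I. $((\mathfrak{b}_1, \dots, \mathfrak{b}_{t+s}), M)$ are generators and relations of $\widetilde{C\ell}_{F,\ell}$.

II. Let $j$ be chosen as above. Denote by $N$ the matrix obtained by removing the
$j$-th column from $M$. Then $((\mathfrak{b}_1, \dots, \mathfrak{b}_{j-1}, \mathfrak{b}_{j+1}, \dots, \mathfrak{b}_{t+s}), N)$ are generators
and relations of $\widetilde{C\ell}_{F,\ell}$.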

This gives the following algorithm: 

Algorithm 1 (Logarithmic Classgroup)

Input: a number field $F$ and a prime number $\ell$

Output: generators $g$ and a relation matrix $H$ for $\widetilde{C\ell}_{F,\ell}$

- Determine a bound $\ell^m$ for the exponent of $\widetilde{C\ell}_{F,\ell}$ and use it as the precision
for the rest of the algorithm.

- Compute generators $\mathfrak{a}_1, \dots, \mathfrak{a}_t$ of $C\ell' = C\ell_F / \langle \mathfrak{p}_1, \dots, \mathfrak{p}_s \rangle$, where $\mathfrak{p}_1, \dots, \mathfrak{p}_s$
are the ideals of $F$ over $\ell$.

- Determine $\mathfrak{a}_{t+1} = (\alpha_1), \dots, \mathfrak{a}_{t+s} = (\alpha_s)$ with $\tilde v_{\mathfrak{p}_i}(\alpha_j) = \delta_{ij}$.

- Compute generators $g := (\mathfrak{b}_1, \dots, \mathfrak{b}_{t+s})^T$ with $\deg(\mathfrak{b}_i) = 0$ from $\mathfrak{a}_1, \dots, \mathfrak{a}_{t+s}$
$(i = 1, \dots, t + s)$.

- Compute a relation matrix $M$ between the generators $g$.

- In case II. remove the $j$-th column from $M$ and the $j$-th generator from $g$.

- Compute the $\ell$-adic Hermite normal form $H$ of $M$.

- Return $(g, H)$.

The Smith normal form of $H$ and the respective transformations of the generators
yield a basis representation of $\widetilde{C\ell}_{F,\ell}$.
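The last step of Algorithm 1 can be illustrated in a simplified setting. The following is a minimal sketch, assuming the relation matrix is given with integer entries known to precision $\ell^m$ and that only the elementary divisors (not the transformed generators) are wanted; the function names are hypothetical and this is not the Magma implementation used by the authors. It exploits the fact that in $\mathbb{Z}/\ell^m\mathbb{Z}$ an entry of minimal $\ell$-valuation divides all other entries.

```python
def valuation(x, ell, m):
    """ell-adic valuation of x modulo ell**m, capped at m (zero has valuation m)."""
    x %= ell ** m
    if x == 0:
        return m
    v = 0
    while x % ell == 0:
        x //= ell
        v += 1
    return v


def smith_invariants_mod_prime_power(M, ell, m):
    """Elementary divisors of the module Z_ell^n / (row space of M), to precision ell**m.

    Columns of M correspond to generators, rows to relations.  Generators that
    are not killed by any relation at this precision are reported with ell**m.
    """
    q = ell ** m
    A = [[x % q for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    exponents = []
    k = 0
    while k < min(rows, cols):
        # Pick the entry of minimal valuation in the remaining submatrix.
        v, pi, pj = min(
            (valuation(A[i][j], ell, m), i, j)
            for i in range(k, rows)
            for j in range(k, cols)
        )
        if v >= m:
            break  # everything left is zero at this precision
        A[k], A[pi] = A[pi], A[k]
        for row in A:
            row[k], row[pj] = row[pj], row[k]
        # Scale the pivot row by a unit so that the pivot becomes ell**v.
        unit = A[k][k] // ell ** v
        inv = pow(unit, -1, q)
        A[k] = [(inv * x) % q for x in A[k]]
        # Clear the pivot column below the pivot (division by ell**v is exact).
        for i in range(k + 1, rows):
            f = A[i][k] // ell ** v
            A[i] = [(a - f * b) % q for a, b in zip(A[i], A[k])]
        # Column operations clear the rest of the pivot row.
        for j in range(k + 1, cols):
            A[k][j] = 0
        exponents.append(v)
        k += 1
    return [ell ** v for v in exponents] + [q] * (cols - len(exponents))
```

For instance, `smith_invariants_mod_prime_power([[2, 4], [6, 8]], 2, 6)` describes a group with cyclic factors of orders 2 and 4; divisors equal to 1 correspond to trivial factors and can be dropped.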

2.2 The Discrete Logarithm in $\widetilde{C\ell}_{F,\ell}$

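Let $\mathfrak{a} \in \widetilde{Id}_{F,\ell}$. Let $g = (\mathfrak{b}_1, \dots, \mathfrak{b}_r)^T$ be a vector of generators of $\widetilde{C\ell}_{F,\ell}$. The discrete
logarithm algorithm returns a vector $c = (c_1, \dots, c_r)$ such that

$$g^c = \mathfrak{b}_1^{c_1} \cdots \mathfrak{b}_r^{c_r} \equiv \mathfrak{a} \bmod \widetilde{Pr}_{F,\ell}.$$

We use the notation from above and proceed as follows: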

Let a G 2d. There exist j € TZp and ai,. . . ,at G such that a = rii=i ’ 
( 7 ). Set gi := Vp^{j) for 1 < i < s. Now 

a - n:=i c • (( 7 ) • 

By the definition of 2d we have 

a = a = nil aT ■ • (n;=i (^) 

As Vp^ (( 7 ) = 0 for t = 1, . . . , s we obtain 

For the two cases we obtain:
I. $(a_1, \dots, a_t, g_1, \dots, g_s)$ is a representation of $\mathfrak{a}$ in $\widetilde{C\ell}_{F,\ell}$.
II. Let $(c_1, \dots, c_{t+s}) := (a_1, \dots, a_t, g_1, \dots, g_s)$; then $(c_1, \dots, c_{j-1}, c_{j+1}, \dots, c_{t+s})$
is a representation of $\mathfrak{a}$ in $\widetilde{C\ell}_{F,\ell}$.



3 The Wild Kernel 



In the following we develop an algorithm for the computation of the wild kernel
of a number field $F$. The study of the $K_2$-group of $F$, which can be defined as

$$K_2(F) = F^\times \otimes_\mathbb{Z} F^\times / \langle x \otimes (1 - x) \mid x \in F \setminus \{0, 1\} \rangle,$$

is difficult [T1,T2]. In order to understand the structure of $K_2(F)$ one constructs
a morphism from $K_2(F)$ to a known abelian group whose kernel is finite. Let $\mathfrak{p}$
be a non-complex place of $F$ and let $F_\mathfrak{p}$ be the completion of $F$ at $\mathfrak{p}$. Denote by
$\mu_\mathfrak{p}$ the torsion subgroup of $F_\mathfrak{p}^\times$. We define



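$$h_\mathfrak{p} : F_\mathfrak{p}^\times \times F_\mathfrak{p}^\times \to \mu_\mathfrak{p}, \qquad (x, y) \mapsto \left( \sqrt[m_\mathfrak{p}]{x} \right)^{\omega_\mathfrak{p}(y) - 1}$$

where $m_\mathfrak{p} = |\mu_\mathfrak{p}|$ and where $\omega_\mathfrak{p}$ is the Artin map. It follows from the multiplicativity
of the norm residue symbol and Kummer theory [Gr, pp. 195-197] that
the map $h_\mathfrak{p}$ is a $\mathbb{Z}$-bilinear map which is trivial for elements of the form $(x, 1 - x)$
where $x \in F \setminus \{0, 1\}$, i.e., $h_\mathfrak{p}$ is a symbol. $h_\mathfrak{p}$ gives us a map from $K_2(F)$ to $\mu_\mathfrak{p}$,
which we also denote by $h_\mathfrak{p}$. The wild kernel of $K_2$ is

$$WK_2(F) = \{ X \in K_2(F) \mid h_\mathfrak{p}(X) = 1 \text{ for all non-complex places } \mathfrak{p} \text{ of } F \}.$$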



Garland's theorem [Ga] states that $WK_2(F)$ is finite. There exist idelic [Ko] and
cohomological methods for studying the wild kernel. We chose to use logarithmic
methods as they allow for the use of an algorithmic approach. Another approach,
which could be exploited algorithmically, is F. Keune's description of the $\ell$-part
of $WK_2(F)$ in terms of the ideal class group of $\ell$-power cyclotomic extensions.

The following theorem by Jaulent [J2] establishes the relationship between
the wild kernel and the logarithmic $\ell$-class group $\widetilde{C\ell}_{F,\ell}$ for the case where $F$
contains a primitive 2£^-th root of unity . 

Theorem 2. Assume that (2(1 G F* ■ Let 9 G N, g > 1. For every divisor 
a = X)p ®pP e>f degree 0 there exists X G K2{F) such that hp{X) = . If 

C2(.i G F* then the map 



<t> : yi. Gz CIf,i -G W K2{F) jW K2{fY 



defined by 



G n I — )■ X^ 



is an isomorphism. 
Moore's reciprocity law [Mo] assures the existence of such an $X$.

Corollary 3. If $F$ contains the $2\ell$-th roots of unity, then

$$\operatorname{rank}_\ell WK_2(F) = \operatorname{rank}_\ell \widetilde{C\ell}_{F,\ell}.$$

The algorithm in [DJ+] computes the structure of $\widetilde{C\ell}_{F,\ell}$ and therefore the $\ell$-rank
of $\widetilde{C\ell}_{F,\ell}$. Thus by the theorem above, the $\ell$-rank of the wild kernel is known if $F$
contains the $2\ell$-th roots of unity.

3.1 $F$ Does Not Contain the $2\ell$-th Roots of Unity

If $\ell = 2$ and $i \notin F$ the group of positive divisor classes can be used for the
description of the 2-rank of the wild kernel [JS2]. We deal with the remaining
case and therefore assume in the following that $\ell$ is odd.

Let $\zeta_\ell$ be a primitive $\ell$-th root of unity. Let $F'$ be the Galois extension $F(\zeta_\ell)$.
Let $d = |\mathrm{Gal}(F'/F)|$. We have $d \mid (\ell - 1)$ and therefore $\gcd(\ell, d) = 1$. In other
words $d \in \mathbb{Z}_\ell^\times$.

There is an idempotent 62 G Zi[Gal{F' / F)] with 62 = 2 X^creGaFF'/F) 
where ka G such that C'’’ = for all cr G Gal(F'/F). We construct such an 
element 62 in the next section. 

Proposition 4 ([JS1]). If $\ell$ is odd and $F$ does not contain the $2\ell$-th roots of
unity then

$$\operatorname{rank}_\ell WK_2(F) = \operatorname{rank}_\ell \widetilde{C\ell}^{\,e_2}_{F(\zeta_\ell),\ell}.$$

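For a better understanding we give a more detailed proof than in [JS1].

Proof. Let $F' := F(\zeta_\ell)$. Set $\Delta := \mathrm{Gal}(F'/F)$. Because $F'$ contains the $2\ell$-th
roots of unity the isomorphism

$$\mu_\ell \otimes_\mathbb{Z} \widetilde{C\ell}_{F'} \cong WK_2(F')/WK_2(F')^\ell$$

holds (Theorem 2). As $\Delta$ acts on $K_2(F')$ such that $\{x, y\}^\sigma = \{x^\sigma, y^\sigma\}$ for all
$\sigma \in \Delta$ and all $(x, y) \in F'^\times \times F'^\times$ it follows that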

= {WK2{F')/WK2{F'YY" 

for 6 i = ^ Xo-s/i ^ does not divide d the idempotent ei induces a surjective 
morphism ^ Tr where Tr is called transfer from the Gpart of K 2 {F') to the Apart 
of K2{F). Therefore W K2{F) /W K2{FY is the image of W K2{F') /W K2{F'Y 
under the restriction of the transfer map Tr. Hence 

{ill Gz'CIf'Y" = WK2{F)fWK2{FY. 

For a G T>£f' we have 

(C G = n (C ® a)" = n C" G a" = n ® = n ^ ® 

(T^A u^A u^A aeA 
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and 

(C G) ® n =C®a''"- 

a^A 

Therefore 

Remark 5. It is possible to generalize this method to the computation of the $\ell$-part
of the higher analogs $WK_{2i-2}(F)$ $(i \ge 2)$ of $WK_2(F)$ which were introduced
by G. Banaszak [Ba]. We will treat them in a future article.

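Example 6 ([JS1]). If $\ell = 3$ and $F = \mathbb{Q}(\sqrt{d})$ with $d \in \mathbb{Z}$ squarefree then
$F' = F(\zeta_3) = \mathbb{Q}(\sqrt{d}, \sqrt{-3})$ with cyclic Galois group $\mathrm{Gal}(F'/F) = \langle \tau \rangle$ and $\zeta_3 \in F'$.

Because $\zeta_3^\tau = \zeta_3^{-1}$ we set $e_2 = \tfrac{1}{2}(1 - \tau)$. We have

$$\operatorname{rank}_3 WK_2(F) = \operatorname{rank}_3 \widetilde{C\ell}^{\,e_2}_{F',3}.$$

Because $e_2 = \tfrac{1}{2}(1 - \tau) = \tfrac{1}{2}(1 + \sigma)$ where $\langle \sigma \rangle = \mathrm{Gal}(F'/F^*)$ with $F^* =
\mathbb{Q}(\sqrt{-3d})$ we obtain

$$\operatorname{rank}_3 WK_2(F) = \operatorname{rank}_3 \widetilde{C\ell}^{\,e_2}_{F',3} = \operatorname{rank}_3 \widetilde{C\ell}_{F^*,3}$$

and

$$\operatorname{rank}_3 WK_2(\mathbb{Q}(\sqrt{d})) = \operatorname{rank}_3 \widetilde{C\ell}_{\mathbb{Q}(\sqrt{-3d}),3}.$$

This is particularly interesting as we do not need any computations in the extension
$F(\zeta_3)$.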

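3.2 Computing $e_2$

Let $d := |\mathrm{Gal}(F'/F)|$ and let $\sigma$ be a generator of $\mathrm{Gal}(F'/F)$. We are looking
for an element $e \in \mathbb{Z}_\ell[\mathrm{Gal}(F'/F)]$ with $e = e^2$. The element $e$ is of the form
$e = \frac{1}{d} \sum_{i=0}^{d-1} k_i \sigma^i$ with $k_i \in \mathbb{Z}_\ell$ $(0 \le i < d)$. Hence the condition $e = e^2$ becomes

$$\Big( \sum_{u=0}^{d-1} k_u \sigma^u \Big) \Big( \sum_{v=0}^{d-1} k_v \sigma^v \Big) = d \sum_{i=0}^{d-1} k_i \sigma^i.$$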

Let $\ell^m$ be the exponent of $\widetilde{C\ell}_{F',\ell}$. It is obvious that it suffices to compute $e$ up
to a precision of $m$ $\ell$-adic digits. Set

$$S_i := \{ (u, v) \in \mathbb{Z}^2 \mid u, v \in \{0, \dots, d - 1\},\ u + v \equiv i \bmod d \}.$$

For $0 \le i \le d - 1$ we solve the congruences

$$\sum_{(u,v) \in S_i} k_u k_v \equiv d\, k_i \bmod \ell^m.$$



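We write $k_i$ as $\sum_{j=0}^{m-1} x_{i,j} \ell^j$ with unknown $x_{i,j} \in \{0, \dots, \ell - 1\}$ $(0 \le i < d,$
$0 \le j < m)$. Thus our congruences become

$$\sum_{(u,v) \in S_i} \Big( \sum_{j=0}^{m-1} x_{u,j} \ell^j \Big) \Big( \sum_{j=0}^{m-1} x_{v,j} \ell^j \Big) \equiv d \Big( \sum_{j=0}^{m-1} x_{i,j} \ell^j \Big) \bmod \ell^m. \qquad (1)$$

We start by solving them modulo $\ell$:

$$\sum_{(u,v) \in S_i} x_{u,0} x_{v,0} \equiv d\, x_{i,0} \bmod \ell.$$

Let $\alpha \in \mathbb{F}_\ell$ be a generator of the cyclic group $\mathbb{F}_\ell^\times$. Set $\delta = \alpha^{(\ell-1)/d}$; then $\delta$ has order
$d$ in $\mathbb{F}_\ell^\times$. Let $a$ be a representative of $\delta$ in $\mathbb{Z}_\ell$. The elements $a_{0,0} = 1$, $a_{1,0} = a$,
$a_{2,0} = a^2, \dots, a_{d-1,0} = a^{d-1}$ (reduced modulo $\ell$) are solutions for $x_{0,0}, \dots, x_{d-1,0}$.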

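Assume that we have found $a_{i,j} \in \{0, \dots, \ell - 1\}$ $(0 \le i < d,\ 0 \le j < w < m)$
such that

$$R_{i,w} := \sum_{(u,v) \in S_i} \Big( \sum_{j=0}^{w-1} a_{u,j} \ell^j \Big) \Big( \sum_{j=0}^{w-1} a_{v,j} \ell^j \Big) - d \Big( \sum_{j=0}^{w-1} a_{i,j} \ell^j \Big) \equiv 0 \bmod \ell^w.$$

With (1) we obtain

$$\sum_{(u,v) \in S_i} \big( x_{u,w} a_{v,0} + x_{v,w} a_{u,0} \big) \ell^w - d\, x_{i,w} \ell^w \equiv -R_{i,w} \bmod \ell^{w+1}$$

and as $R_{i,w} \equiv 0 \bmod \ell^w$ this becomes

$$\sum_{(u,v) \in S_i} \big( x_{u,w} a_{v,0} + x_{v,w} a_{u,0} \big) - d\, x_{i,w} \equiv -\frac{R_{i,w}}{\ell^w} \bmod \ell \qquad (2)$$

for $i = 0, \dots, d - 1$ which is a system of $d$ linear equations in $d$ variables over $\mathbb{F}_\ell$.

Therefore we obtain a solution to (1) by first computing $a_{0,0}, \dots, a_{d-1,0}$ as
described above and then solving systems of linear equations (2) inductively for
$w = 1, \dots, m - 1$ to obtain values $a_{0,w}, \dots, a_{d-1,w}$ for $x_{0,w}, \dots, x_{d-1,w}$.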
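The congruences $\sum_{(u,v) \in S_i} k_u k_v \equiv d\, k_i \bmod \ell^m$ can be checked numerically. The sketch below does not follow the digit-by-digit linear systems described above; instead it uses a different route, stated here as an assumption: if $\zeta$ is an element of exact order $d$ in $(\mathbb{Z}/\ell^m\mathbb{Z})^\times$, then $k_i = \zeta^i$ solves all the congruences, because every product $k_u k_v$ with $u + v \equiv i \bmod d$ equals $\zeta^i$. Such a $\zeta$ is obtained by raising an element of order $d$ modulo $\ell$ to the power $\ell^{m-1}$. All function names are hypothetical.

```python
def primitive_root(ell):
    """Smallest generator of the cyclic group (Z/ell)^*, for an odd prime ell."""
    phi = ell - 1
    prime_factors, n, p = set(), phi, 2
    while p * p <= n:
        while n % p == 0:
            prime_factors.add(p)
            n //= p
        p += 1
    if n > 1:
        prime_factors.add(n)
    for g in range(2, ell):
        if all(pow(g, phi // r, ell) != 1 for r in prime_factors):
            return g
    raise ValueError("no primitive root found; is ell an odd prime?")


def idempotent_coefficients(ell, d, m):
    """Coefficients k_0, ..., k_{d-1} modulo ell**m with k_i = zeta**i."""
    if (ell - 1) % d != 0:
        raise ValueError("d must divide ell - 1")
    q = ell ** m
    delta = pow(primitive_root(ell), (ell - 1) // d, ell)  # order d modulo ell
    zeta = pow(delta, ell ** (m - 1), q)                   # order d modulo ell**m
    return [pow(zeta, i, q) for i in range(d)]


def satisfies_congruences(k, ell, m):
    """Check sum_{u+v = i mod d} k_u k_v = d * k_i (mod ell**m) for every i."""
    d, q = len(k), ell ** m
    return all(
        sum(k[u] * k[(i - u) % d] for u in range(d)) % q == (d * k[i]) % q
        for i in range(d)
    )
```

For $\ell = 3$, $d = 2$ and $m = 3$ this yields $k_0 = 1$, $k_1 = 26 \equiv -1 \bmod 27$, in agreement with the element $\tfrac{1}{2}(1 - \tau)$ of Example 6.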

3.3 Computing the $\ell$-Rank of the Wild Kernel

By Proposition 4 the $\ell$-rank of the wild kernel of $F$ equals the $\ell$-rank of $\widetilde{C\ell}^{\,e_2}_{F(\zeta_\ell),\ell}$.

Let bi, . . . , br be a basis of C£F((t)/ and let £^' be the order of b^ in C£p{Q),i 
{1 <i < r), i.e., 



2 = 1 
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The elements , . . . , b^ are generators of For 1 < z < r the discrete 

logarithm in gives representations (zzi^i, . . . ,rii^r) of the b*^ with 

bf = b”'’" •••C’’' mod^. 

Let A G such that 

( 0 ni4 . . . ZZrp \ 

: : 7l = 0. 

\ 0 ni^r ■ ■ ■ rir^r / 

where 7li,7l2 G A 2 is a relation matrix of the sub- 
group generated by bi, . . . , b^ which are represented by (rzi,i, . . . , rii^r) 

(1 < z < r). Denote by the £-adic Hermite normal form of A 2 . Then 

rank^ LFiF 2 (F') = i = #{hi^i | 1 < z < r, hi^i yf 1}. 

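4 A Complete Description of the $\ell$-part of the Wild Kernel

Let $F$ be a number field. Let $\ell$ be a prime number. Let $\ell^m := \exp \widetilde{C\ell}_{F,\ell}$. If
$F$ contains a primitive $\ell^{m+1}$-th root of unity $\zeta_{\ell^{m+1}}$ then we obtain a complete
description of the $\ell$-part of the wild kernel of $F$ as follows:

Assume that $\widetilde{C\ell}_{F,\ell}$ is not trivial, then

$$\widetilde{C\ell}_{F,\ell} = \bigoplus_{i=1}^{r} \mathbb{Z}/\ell^{n_i}\mathbb{Z}\, [\mathfrak{a}_i].$$

Therefore there exists a family $(\alpha_i)_i \subset \mathcal{R}_F = \mathbb{Z}_\ell \otimes_\mathbb{Z} F^\times$ such that $\mathfrak{a}_i^{\ell^{n_i}} = (\alpha_i)$ for
$1 \le i \le r$. We denote by $\{\cdot, \cdot\}$ the canonical map

$$\{\cdot, \cdot\} : F^\times \times F^\times \to K_2(F);$$

it is called Steinberg's symbol. The $\ell$-part of the wild kernel is [So]

$$\bigoplus_{i=1}^{r} \mathbb{Z}/\ell^{n_i}\mathbb{Z}\, \{\zeta_{\ell^{n_i}}, \alpha_i\}.$$

Let $\alpha \in \mathcal{R}_F$. We denote by $\bar\alpha$ the approximation of $\alpha$ to a precision of $m$ $\ell$-adic
digits. As Steinberg's symbol is $\mathbb{Z}_\ell$-bilinear we have $\{\zeta_{\ell^{n_i}}, \alpha\} = \{\zeta_{\ell^{n_i}}, \bar\alpha\}$ for all
$\alpha \in \mathcal{R}_F$. Therefore the $\ell$-part of the wild kernel is

$$\bigoplus_{i=1}^{r} \mathbb{Z}/\ell^{n_i}\mathbb{Z}\, \{\zeta_{\ell^{n_i}}, \bar\alpha_i\}.$$

5 Examples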



All algorithms presented here have been implemented in the computer algebra
system Magma [C+]. The groups are given as lists of the orders of their cyclic
factors. By $i$ we denote a root of $x^2 + 1$, by $\zeta_m$ we denote a primitive $m$-th root
of unity.

K. Belabas and H. Gangl have developed an algorithm for the computation
of the tame kernel $K_2 O_F$ [BG]. The following table contains the structure of
$K_2 O_F$ as computed by Belabas and Gangl and the $\ell$-rank of the wild kernel
$WK_2(F)$ calculated with our methods. The starred entry is a conjectural result.



F              K_2 O_F    ℓ   Cl~_{F(ζ_ℓ),ℓ}   Cl~^{e_2}_{F(ζ_ℓ),ℓ}   rank_ℓ(WK_2(F))

Q(√-331)       [3]        3   [3,3]            [3]                    1
Q(7=M7)        [3]        3   [3,9]            [3]                    1
Q(√-472)       [5]        5   [5,5]            [5]                    1
Q(y=m)         [5]        5   [5,5]            [5]                    1
Q(√-696)       [42]       3   [3]              [1]                    0
                          7   [7,7]            [7]                    1
Q(V=7^)        [2,18]*    3   [3,3]            [3]                    1



The next table contains more fields together with the main data needed for the
computation of the $\ell$-rank of $WK_2$. $\chi_\alpha$ denotes the minimal polynomial of $\alpha$
over $\mathbb{Q}$.

F                                      ℓ   Cl~_{F(ζ_ℓ),ℓ}   Cl~^{e_2}_{F(ζ_ℓ),ℓ}   rank_ℓ(WK_2(F))

Q(√-7307)                              5   [5,25]           [1]                    0
Q(√-356467)                            3   [3,3,27]         [3]                    1
Q(α), χ_α = — 365                      3   [9]              [9]                    1
Q(α), χ_α = + x'^ — 133x — 1937        3   [3,3]            [3]                    1
Q(α), χ_α = x^ + — 65x + 1875          3   [3,3,3]          [3,3]                  2
Q(α), χ_α = x^ + x"^ — 65x + 1875      3   [3,3,3]          [3,3]                  2
Q(α), χ_α = x^ + 9x^ + 125             3   [3,3]            [3]                    1



Our last table gives examples of the 2-part of the wild kernel together with the
generators of the cyclic factors. We made extensive use of the discrete logarithm
in $\widetilde{C\ell}_{F,2}$ in order to find small generators for it.
F


C£f,2 


2-part of WK 2 {F) 




[2,2] 


Z/2Z{ l,i 2}©Z/2Z| 1, 


Q{i, VM) 


[2,2,2] 


Z/2Z{ 1,3}©Z/2Z| i/v^+2i*+2|^ 
Z/2Z| (i+4)>/3^+19i+76 1 


Q(i, V1173 


[2,2,2] 


Z/2Z{ 1,3}©Z/2Z| (4*+16)vTm+137*+548 |^ 

77 f 1 (-927i-3300)©TT73-31749i-13022 ) 


Q(Cs,v^) 


[4,4,4] 


Z/4Z {i, (2C| + 3C| + 2Cs)v^ - 80CI + 80Cs + 114} © 

77/|77 h- (15C^-K2C^+38C8-tl2)V56T-93C|-H12C^-330C8-372 'l ^ 






Z/4Z [i, (-CI + Cl - C8)y56T + 13CI - 28CI + 15Cs + 2} 
Abstract. We propose an algorithm to compute the Frobenius polynomial
of an ordinary non hyperelliptic curve of genus 3 over $\mathbb{F}_{2^N}$. The
method is a generalization of Mestre's AGM-algorithm for hyperelliptic
curves and leads to a quasi quadratic time algorithm for point counting.



The current methods for point counting on curves over finite fields of small 
characteristic rely essentially on a p-adic approach. They split up in three 
classes: those based on cohomology (Kedlaya), those based on deformation the- 
ory (Lauder) and those based on the canonical lift, proposed initially by Satoh. 
The AGM-algorithm developed by Mestre [Mes02] for elliptic curves belongs 
to this last category. It is an elegant and natural variant in characteristic 2 of 
Satoh’s one using, in analogy with the complex field, the machinery of theta func- 
tions. Mestre generalized it later to the hyperelliptic case and Lercier-Lubicz’s 
implementations of these algorithms are the current records for point counting 
in characteristic 2 [LL03]. 

In this paper, we propose a generalization of the AGM-algorithm to the genus 
3 non hyperelliptic case, as described in the author’s PhD thesis [Rit03]. Fast 
algorithms for point counting on non hyperelliptic curves exist in the literature. 
However, as far as we know, the present one is the first which is fast enough for 
problems of cryptographic size in characteristic 2. The aim of the paper is to 
give all the details for an implementation. With this end in view, we shall give 
an algorithm based on Weber’s work for the computation of the initial theta 
constants. This computation, which is fairly easy for hyperelliptic curves,
requires some work for this case. The other issue is to find a good lift in order
that the computations take place in the field of definition of the lift. It is the
main achievement of this paper that it shows how to find good lifts. Once this is
achieved, we are able to use the optimized iteration process of [LL03]. We shall
illustrate the method with an example over $\mathbb{F}_{2^{100}}$.
1 The Principles of the AGM-algorithms

A fundamental tool in p-adic methods is the canonical lift of an abelian variety. 
However, the canonical lift exists only for ordinary abelian varieties which form 
a dense open set of the moduli space. 

Definition 1. Let $k$ be a field of characteristic $p > 0$. We say that an abelian
variety $A/k$ of dimension $g$ is ordinary if $|A[p](\bar k)| = p^g$. We say that a curve
$C/k$ is ordinary if $\mathrm{Jac}(C)$ is ordinary.

Let $k = \mathbb{F}_q$ with $q = 2^N$ and let $\mathbb{Q}_q$ be the unramified extension of $\mathbb{Q}_2$ of
degree $N$, $\mathbb{Z}_q$ its ring of integers, $\pi$ a uniformizer and $\sigma$ the Frobenius substitution.
We consider $A/k$ an ordinary principally polarized abelian variety. The
key construction is the canonical lift [Mes72, Chap. V].

Theorem 1. There exists a unique (up to isomorphism) principally polarized
abelian scheme $A^{\uparrow}$ over $\mathrm{Spec}(\mathbb{Z}_q)$ (called the canonical lift of $A$) characterized
by the following properties:

1. its special fiber is isomorphic to $A$.

2. $\mathrm{End}_{\mathbb{Q}_q}(A^{\uparrow}) \simeq \mathrm{End}_k(A)$ or equivalently there exists an isogeny
$\mathrm{Fr}^{\uparrow} : A^{\uparrow} \to (A^{\uparrow})^\sigma$ lifting the little Frobenius $\mathrm{Fr} : A \to A^\sigma$.

The idea of AGM-algorithms is to produce sequences of 2-adic elements
of which subsequences converge to some 'invariants' associated to $A^{\uparrow}$. The
formulas involved in the recurrence are a generalization of the Arithmetic
Geometric Mean $((a + b)/2, \sqrt{ab})$ (see Formula (2)). They correspond to
2-isogenies $A_i \to A_{i+1} = A_i / A_i[2]^{loc}$ where $A_0$ is a lift of $A$ and $A_i[2]^{loc}$ are
the 2-torsion points in the kernel of the reduction. The convergence of the
subsequences is then a consequence of a result of Carls [Car02].
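As a point of comparison, the classical Arithmetic Geometric Mean recurrence $(a, b) \mapsto ((a + b)/2, \sqrt{ab})$ can be sketched over the real numbers. This is only an illustration of the recurrence that the 2-adic formulas generalize; it is not the 2-adic iteration of the algorithm, and the function name and tolerance are assumptions made for the example.

```python
import math


def agm(a, b, tol=1e-15, max_iter=64):
    """Arithmetic-geometric mean of two positive reals.

    Iterates (a, b) -> ((a + b) / 2, sqrt(a * b)) until both terms agree
    to relative precision tol.  Convergence is quadratic, so a handful of
    steps suffices in double precision.
    """
    if a <= 0 or b <= 0:
        raise ValueError("both arguments must be positive")
    for _ in range(max_iter):
        if abs(a - b) <= tol * max(a, b):
            break
        a, b = (a + b) / 2, math.sqrt(a * b)
    return (a + b) / 2
```

The limit always lies between the geometric and the arithmetic mean of the starting values; for example `agm(1.0, math.sqrt(2))` is the reciprocal of Gauss's constant.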

When we are near enough (in 2-adic metric) to $A^{\uparrow}$, the second step is to
determine the Frobenius polynomial. One can prove that the determinant of the
action of the lift of Frobenius on differentials is linked to a quotient of elements
in the sequence (see Step 4 in Section 4). The proof relies on the transformation
formula for theta constants [Mum83]. On the other hand, this determinant is
related up to a sign to the algebraic integer obtained as the product of the
roots of the Frobenius polynomial which are 2-adic units. The last step is then
to recover the minimal polynomial of this product (with LLL) and then the
Frobenius polynomial.

Let now $A$ be the Jacobian of an ordinary curve $C/k$. One of the advantages
of the AGM method is that the recurrence and the formula to compute the
Frobenius are 'universal'. In particular they are the same for hyperelliptic
and non hyperelliptic curves of the same genus. Thus the specificities of the
computation rely only on the initialization. The initial terms are p-adic
algebraic analogues of (quotients of powers of) theta constants and no general
formula seems to be known (at least in the non hyperelliptic case). In Section
2, we sum up the results of Riemann and Weber which lead to these formulas
for the non hyperelliptic genus 3 case.

For an actual implementation we need to keep all these computations in $\mathbb{Q}_q$.
This leads us to find a good model for $C$ and for its lift. In particular, for the
genus 3 non hyperelliptic case, we ask for the rationality of 28 specific lines (the
bitangents of the lift of $C$). This is carried out in Section 3.

2 Computation of the Initial Theta Constants 

In this section, we review some results of Riemann [Rie98] and Weber [Web76] 
about bitangents and the computation of theta constants. At the end of this 
section, we add a new result (Prop. 3) that is useful to our algorithm. 

Let $K$ be an algebraically closed field of characteristic $p \ge 0$ and $C/K$ a
genus three non hyperelliptic curve. We suppose that $C$ is embedded as a plane
quartic and we denote $x_1, x_2, x_3$ (or sometimes $x, y, z$) the coordinates in the
projective plane. As $C$ is canonically embedded, the canonical divisors $K_C$ on $C$
can be described as the intersection of the lines with $C$. We will focus on these
divisors, especially on the following ones:

Definition 2. A line $l$ is called a bitangent of $C$ if the intersection divisor $(l \cdot C)$
is of the form $2P + 2Q$ for some not necessarily distinct points $P, Q$ of $C$.



Remark 1. If there exists a line $l$ with $(l \cdot C) = 4P$, we call $P$ a hyperflex. It is
known that a generic quartic does not have any hyperflex.

For the rest of this section, we suppose that the characteristic of $K$ is
different from 2 (the characteristic 2 case is postponed to Section 3.1).

We denote $\Sigma = \{L \in \mathrm{Pic}^2(C),\ L^2 = \mathcal{K}\}$ the set of theta characteristic bundles
(where $\mathcal{K}$ is the canonical bundle). This set splits into two disjoint subsets $\Sigma_i$,
$i = 0, 1$, defined by $\Sigma_i = \{L \in \Sigma,\ h^0(L) := \dim H^0(L) = i\}$.

Proposition 1. There is a canonical bijection between the set of bitangents and
$\Sigma_1$.

Proof. Let $l$ be a bitangent. We write $(l \cdot C) = 2(P + Q)$. We define $L = L(P + Q)$. It
is obvious that $L \in \Sigma$ and, since $C$ is non hyperelliptic, the Riemann-Roch theorem
implies that $h^0(L) = 1$. Therefore $L \in \Sigma_1$.

Conversely, if $L \in \Sigma_1$, there exists a regular section $s \in H^0(L)$ different from
0. Let $D = (s)$. Since $L = L(D) \in \mathrm{Pic}^2(C)$, we have $D = P + Q$ and since
$L \in \Sigma$, $2D \sim K_C$. We conclude that $D$ is the divisor of a bitangent and these
two processes are inverse to each other.
The set $\Sigma$ is a principal homogeneous space for $\mathrm{Jac}(C)[2]$ and if we choose a
symplectic basis $(e_i, f_i)$ of $\mathrm{Jac}(C)[2]$ (for the Weil pairing) we can represent each
2-torsion point as $\sum_i \epsilon_i e_i + \sum_i \epsilon'_i f_i$.

This choice also fixes an $L_0 \in \Sigma$ and so we have an explicit bijection between
$\mathrm{Jac}(C)[2]$ and $\Sigma$. We denote by $L_{[\epsilon]}$ the associated theta characteristic bundle
with $[\epsilon] = \begin{bmatrix} \epsilon \\ \epsilon' \end{bmatrix}$ a $2 \times 3$ matrix called a characteristic (we refer to [GH01] for a
more general setting).

We can also split the set of characteristics into two disjoint subsets. We say that
a characteristic is even (resp. odd) if the scalar product $\epsilon \cdot \epsilon' = 0$ (resp. 1).

Proposition 2 ([GH01]). The bijection between $\Sigma$ and the set of characteristics
restricts to a bijection between $\Sigma_1$ and the set of odd characteristics.



In particular, counting the number of odd characteristics, we recover the 
classical result on quartics : 



Corollary 1. A smooth plane quartic has exactly 28 bitangents. 
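The count behind Corollary 1 can be reproduced by brute force. The sketch below enumerates all characteristics in genus 3 as pairs of vectors $(\epsilon, \epsilon')$ with entries in $\{0, 1\}$ and sorts them by the parity of the scalar product $\epsilon \cdot \epsilon'$ taken modulo 2; the function names are chosen for this illustration only.

```python
from itertools import product


def characteristics(g=3):
    """All characteristics [eps; eps'] with eps, eps' in {0, 1}^g."""
    vectors = list(product((0, 1), repeat=g))
    return [(eps, eps_prime) for eps in vectors for eps_prime in vectors]


def is_odd(eps, eps_prime):
    """A characteristic is odd when eps . eps' is 1 modulo 2."""
    return sum(a * b for a, b in zip(eps, eps_prime)) % 2 == 1


def count_by_parity(g=3):
    """Return (number of odd, number of even) characteristics in genus g."""
    chars = characteristics(g)
    odd = sum(1 for c in chars if is_odd(*c))
    return odd, len(chars) - odd
```

In genus 3 there are $2^6 = 64$ characteristics, of which 28 are odd and 36 are even; the 28 odd ones correspond to the bitangents.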



In order to understand the combinatorics of the bitangents, we introduce the 
following notion : 

Definition 3. Let $S = \{[\epsilon_i]\}_{i=1,\dots,7}$ be a subset of characteristics. The subset is
called a principal set if

- every odd characteristic can be written as $[\epsilon_i]$ or $[\epsilon_i] + [\epsilon_j]$, $i \neq j$, and

- every even characteristic can be written as $[0]$ or $[\epsilon_i] + [\epsilon_j] + [\epsilon_k]$ with $i, j, k$
distinct.



We may choose S as 




We denote $\beta_i$ the bitangent associated to $[\epsilon_i]$ and $\beta_{ij}$ the bitangent associated
to $[\epsilon_i] + [\epsilon_j]$ by Propositions 1 and 2.

Definition 4. The set $(\beta_i)_{i=1,\dots,7}$ is called an Aronhold system.

Remark 2. Geometrically, an Aronhold system is characterized by the fact that
the points of contact of any three bitangents in the set are never on a conic.
There are 288 Aronhold systems for a given quartic (see [Dol03]).

After a linear transformation, we may suppose

$$\begin{cases}
\beta_1 : x_1 = 0 & \beta_5 : a_1 x_1 + a_2 x_2 + a_3 x_3 = 0 \\
\beta_2 : x_2 = 0 & \beta_6 : a'_1 x_1 + a'_2 x_2 + a'_3 x_3 = 0 \\
\beta_3 : x_3 = 0 & \beta_7 : a''_1 x_1 + a''_2 x_2 + a''_3 x_3 = 0 \\
\beta_4 : x_1 + x_2 + x_3 = 0
\end{cases}$$

It was already known to Riemann how to construct a quartic for which the
$(\beta_i)_{i=1,\dots,7}$ are one of its Aronhold systems. Recently, Caporaso and Sernesi in
[CS00] and Lehavi in [Leh02] proved that this quartic is uniquely defined by the
set $(\beta_i)$. So, Riemann's construction below shows how to recover the curve from
an Aronhold system and Propositions 3 and 4 show how to find an Aronhold
system from $C$. Let us first recall Riemann's construction from this point of
view [Rie98].

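Theorem 2 (Riemann). The curve $C$ is isomorphic to the curve (which we
call a Riemann model)

$$\sqrt{x_1 u_1} + \sqrt{x_2 u_2} + \sqrt{x_3 u_3} = 0$$

where $u_1, u_2, u_3$ are given by

$$\begin{cases}
u_1 + u_2 + u_3 + x_1 + x_2 + x_3 = 0 \\
\dfrac{u_1}{a_1} + \dfrac{u_2}{a_2} + \dfrac{u_3}{a_3} + k a_1 x_1 + k a_2 x_2 + k a_3 x_3 = 0 \\
\dfrac{u_1}{a'_1} + \dfrac{u_2}{a'_2} + \dfrac{u_3}{a'_3} + k' a'_1 x_1 + k' a'_2 x_2 + k' a'_3 x_3 = 0 \\
\dfrac{u_1}{a''_1} + \dfrac{u_2}{a''_2} + \dfrac{u_3}{a''_3} + k'' a''_1 x_1 + k'' a''_2 x_2 + k'' a''_3 x_3 = 0
\end{cases}$$

with $k, k', k''$ solutions of

$$\begin{pmatrix}
\frac{1}{a_1} & \frac{1}{a'_1} & \frac{1}{a''_1} \\
\frac{1}{a_2} & \frac{1}{a'_2} & \frac{1}{a''_2} \\
\frac{1}{a_3} & \frac{1}{a'_3} & \frac{1}{a''_3}
\end{pmatrix}
\begin{pmatrix} \lambda \\ \lambda' \\ \lambda'' \end{pmatrix}
= \begin{pmatrix} -1 \\ -1 \\ -1 \end{pmatrix},
\qquad
\begin{pmatrix}
\lambda a_1 & \lambda' a'_1 & \lambda'' a''_1 \\
\lambda a_2 & \lambda' a'_2 & \lambda'' a''_2 \\
\lambda a_3 & \lambda' a'_3 & \lambda'' a''_3
\end{pmatrix}
\begin{pmatrix} k \\ k' \\ k'' \end{pmatrix}
= \begin{pmatrix} -1 \\ -1 \\ -1 \end{pmatrix}.$$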

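Then we can express all the bitangents:

$$\begin{aligned}
&\beta_1 : x_1 = 0 \qquad \beta_2 : x_2 = 0 \qquad \beta_3 : x_3 = 0 \\
&\beta_{23} : u_1 = 0 \qquad \beta_{13} : u_2 = 0 \qquad \beta_{12} : u_3 = 0 \\
&\beta_4 : x_1 + x_2 + x_3 = 0 \qquad \beta_5 : a_1 x_1 + a_2 x_2 + a_3 x_3 = 0 \\
&\beta_6 : a'_1 x_1 + a'_2 x_2 + a'_3 x_3 = 0 \qquad \beta_7 : a''_1 x_1 + a''_2 x_2 + a''_3 x_3 = 0 \\
&\beta_{14} : u_1 + x_2 + x_3 = 0 \qquad \beta_{15} : \tfrac{u_1}{a_1} + k a_2 x_2 + k a_3 x_3 = 0 \\
&\beta_{16} : \tfrac{u_1}{a'_1} + k' a'_2 x_2 + k' a'_3 x_3 = 0 \qquad \beta_{17} : \tfrac{u_1}{a''_1} + k'' a''_2 x_2 + k'' a''_3 x_3 = 0 \\
&\beta_{24} : x_1 + u_2 + x_3 = 0 \qquad \beta_{25} : k a_1 x_1 + \tfrac{u_2}{a_2} + k a_3 x_3 = 0 \\
&\beta_{26} : k' a'_1 x_1 + \tfrac{u_2}{a'_2} + k' a'_3 x_3 = 0 \qquad \beta_{27} : k'' a''_1 x_1 + \tfrac{u_2}{a''_2} + k'' a''_3 x_3 = 0 \\
&\beta_{34} : x_1 + x_2 + u_3 = 0 \qquad \beta_{35} : k a_1 x_1 + k a_2 x_2 + \tfrac{u_3}{a_3} = 0 \\
&\beta_{36} : k' a'_1 x_1 + k' a'_2 x_2 + \tfrac{u_3}{a'_3} = 0 \qquad \beta_{37} : k'' a''_1 x_1 + k'' a''_2 x_2 + \tfrac{u_3}{a''_3} = 0 \\
&\beta_{45} : \frac{u_1}{a_1 (1 - k a_2 a_3)} + \frac{u_2}{a_2 (1 - k a_3 a_1)} + \frac{u_3}{a_3 (1 - k a_1 a_2)} = 0 \\
&\beta_{46} : \frac{u_1}{a'_1 (1 - k' a'_2 a'_3)} + \frac{u_2}{a'_2 (1 - k' a'_3 a'_1)} + \frac{u_3}{a'_3 (1 - k' a'_1 a'_2)} = 0 \\
&\beta_{47} : \frac{u_1}{a''_1 (1 - k'' a''_2 a''_3)} + \frac{u_2}{a''_2 (1 - k'' a''_3 a''_1)} + \frac{u_3}{a''_3 (1 - k'' a''_1 a''_2)} = 0 \\
&\beta_{56} : \frac{u_1}{1 - k'' a''_2 a''_3} + \frac{u_2}{1 - k'' a''_3 a''_1} + \frac{u_3}{1 - k'' a''_1 a''_2} = 0 \\
&\beta_{57} : \frac{u_1}{1 - k' a'_2 a'_3} + \frac{u_2}{1 - k' a'_3 a'_1} + \frac{u_3}{1 - k' a'_1 a'_2} = 0 \\
&\beta_{67} : \frac{u_1}{1 - k a_2 a_3} + \frac{u_2}{1 - k a_3 a_1} + \frac{u_3}{1 - k a_1 a_2} = 0
\end{aligned}$$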

Following Riemann, Weber [Web76] performed computations over $\mathbb{C}$ to evaluate
algebraically the theta constants and express them in terms of the bitangents.
Recall that $\vartheta[\epsilon](0, \Omega)$ is the theta constant defined (here with $g = 3$) by

$$\vartheta[\epsilon](0, \Omega) = \sum_{n \in \mathbb{Z}^g} \exp\big( i\pi (n + \epsilon/2)\, \Omega\, {}^t(n + \epsilon/2) + 2 i\pi (n + \epsilon/2)\, {}^t(\epsilon'/2) \big)$$

where $\Omega$ is a Riemann matrix associated to $(\mathrm{Jac}(C), \theta)$ and to the Aronhold
system.
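The defining series can be evaluated numerically by truncating the lattice sum. The following is a minimal sketch under stated assumptions: the period matrix is passed as a plain nested list, the sum runs over the box $[-B, B]^g$ with a hypothetical default $B = 6$, and the test matrix $\Omega = i \cdot \mathrm{Id}$ used below is chosen only because the series factorizes for it, not because it comes from a plane quartic.

```python
import cmath
import itertools
import math


def theta_constant(eps, eps_prime, omega, bound=6):
    """Truncated theta constant with characteristic [eps; eps'] at z = 0.

    eps, eps_prime: sequences of 0/1 of length g; omega: g x g symmetric
    matrix (nested lists) with positive definite imaginary part.  The sum is
    taken over all n in [-bound, bound]^g.
    """
    g = len(eps)
    total = 0j
    for n in itertools.product(range(-bound, bound + 1), repeat=g):
        v = [n[i] + eps[i] / 2 for i in range(g)]
        quadratic = sum(v[i] * omega[i][j] * v[j] for i in range(g) for j in range(g))
        linear = sum(v[i] * eps_prime[i] / 2 for i in range(g))
        total += cmath.exp(1j * math.pi * quadratic + 2j * math.pi * linear)
    return total


# Illustrative period matrix: i times the identity, so the series is a product
# of three one-dimensional sums.
OMEGA = [[1j, 0, 0], [0, 1j, 0], [0, 0, 1j]]
```

With this choice, `theta_constant([0, 0, 0], [0, 0, 0], OMEGA)` agrees with the cube of $\sum_n e^{-\pi n^2} = \pi^{1/4}/\Gamma(3/4)$, and the theta constant of an odd characteristic such as $\epsilon = \epsilon' = (1, 0, 0)$ vanishes up to rounding, since the terms for $n$ and $-n - \epsilon$ cancel.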

Theorem 3 (Weber). Let $[\epsilon] = [\epsilon_i] + [\epsilon_j] + [\epsilon_k]$ be an even characteristic. Then

f ^[x](0. -i2) y ^ \/3j , Pj , Pij I |/3^fc , f3jk , I |/3j , /3jfc , /3fc 1 1/3, , Pik , fik \ 

V -d[0](0, 17 ) ) \Pj,l3jk, f3ij\\f3i,P^k,f^^j\\P^, Pj,Pk\\P^k,f^jk,Pk\ 

where \Pi^, /3i^ \ = det{Pi^, Pi^. 

To complete the understanding of quartics, we have to show how to find an 
Aronhold system starting from a Riemann model. We give two solutions. The 
first (Prop. 3) is more useful for theoretical purposes and will be used in Section 
3.2. The second (Prop. 4) is more suited for explicit computations. 

Proposition 3. Let $C : \sqrt{x_1 u_1} + \sqrt{x_2 u_2} + \sqrt{x_3 u_3} = 0$ be a smooth quartic. An
Aronhold system for the curve is given by seven lines $(x_i)_{i=1,\dots,7}$, the four last
being computed by the following algorithm:

1. Let $D_1(\lambda)$ be the determinant of the Hessian of the family of conics

$$Q_1(\lambda) = \lambda^2 (x_2 u_3) + \lambda (x_1 u_1 - x_2 u_2 - x_3 u_3) + (x_3 u_2).$$

2. We compute the resultant Ri{xi,X 2 ,X 3 ) = Res{Di,Qi,X)/X. 

3. In the same way, we compute $R_2(x_1, x_2, x_3)$, relatively to

$$Q_2(\lambda) = \lambda^2 (x_1 u_3) + \lambda (x_2 u_2 - x_1 u_1 - x_3 u_3) + (x_3 u_1).$$

4. Then $R = \gcd(R_1, R_2) = \prod_{i=4}^{7} x_i$.
Proof. We only sketch the proof. For details see [Rit03, Sec. 3.5]. If $l$ is a
bitangent with intersection divisor $2D$, we denote $\sqrt{l}$ a section of $L(D) \in \Sigma_1$.
After a choice of a symplectic basis for $\mathrm{Jac}(C)[2]$, we may assign to each $\sqrt{l}$ a
characteristic which we denote $[\sqrt{l}]$. Then, we can find a symplectic basis such
that the following decompositions hold (and we say for example that $(x_2)$ and
$(u_3)$ are paired relatively to the first following decomposition):

[VTi] = [v^] + [v^] = [v^ + [v^] = [v^ + [V^] 

= [\/a^] + [\/ub] = [\/^ + [y/ufi] = [y/^ + [y/ur] 

= [y/^] + [v^] = [v^ + [v^] = [y/^ + [\A4] 

= [v^] + [\A4] = [v^] + [\/<] = + [\A4] 

We generalize the notion of theta characteristic bundle by considering bundles 
of degree 2 n such that Lf = /C”. For each n, this set is again a principal 
homogeneous space of Jac(C')[2] and we can associate to each bundle a unique 
characteristic [e]. We denote the corresponding bundle. Riemann-Roch 

theorem implies that 

{ 0 if n = 1 and [e] even 

1 if n = 1 and [e] odd 

3 if [e] = [0] and n = 2 

2n — 2 otherwise. 

In particular, for n = 2 and [e] yf [0], the dimension of is 2. A basis 

for is given by {^/xfus, y/xffuf). Hence, every s G may be written as 

s = Therefore 

= \^{X2U^) + A(a:iMi - X2U2 - X3U3) + (X3U2). 

Thus, the sections are parametrized by a family of conics which are tangent 
to C and which degenerate for 6 values of A (including A = 00 ) for which 
splits into two linear factors which are pair of bitangents with characteristics 
given by the decomposition of [v^]- Indeed, it is obvious that the family splits 

only when the conic is singular i.e. its Hessian matrix has a zero determinant. 

(2) 

We can proceed in the same way with s G and we obtain 

$$= \lambda^2 (x_1 u_3) + \lambda (x_2 u_2 - x_1 u_1 - x_3 u_3) + (x_3 u_1).$$

Looking at the two decompositions, the bitangents common to
both are $x_3, u_3$ (for resp. $\lambda = 0$, $\lambda = \infty$) and $(x_i)_{i=4,\dots,7}$. So the
algorithm gives the product of the four last.
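The two families of conics are tied to the curve through their discriminants in $\lambda$: rationalizing $\sqrt{x_1 u_1} + \sqrt{x_2 u_2} + \sqrt{x_3 u_3} = 0$ gives $\sum_i (x_i u_i)^2 - 2 \sum_{i<j} x_i u_i x_j u_j = 0$, and this expression equals the discriminant of both $Q_1$ and $Q_2$. The sketch below checks this identity at random integer points, treating the values of $x_i$ and $u_i$ as independent integers (an assumption made only to test the polynomial identity); the function names are hypothetical.

```python
import random


def rationalized_quartic(x, u):
    """Value of sum (x_i u_i)^2 - 2 sum_{i<j} x_i u_i x_j u_j."""
    p = [xi * ui for xi, ui in zip(x, u)]
    return sum(t * t for t in p) - 2 * (p[0] * p[1] + p[0] * p[2] + p[1] * p[2])


def discriminant_q1(x, u):
    """Discriminant in lambda of Q1 = l^2 x2 u3 + l (x1u1 - x2u2 - x3u3) + x3 u2."""
    a = x[1] * u[2]
    b = x[0] * u[0] - x[1] * u[1] - x[2] * u[2]
    c = x[2] * u[1]
    return b * b - 4 * a * c


def discriminant_q2(x, u):
    """Discriminant in lambda of Q2 = l^2 x1 u3 + l (x2u2 - x1u1 - x3u3) + x3 u1."""
    a = x[0] * u[2]
    b = x[1] * u[1] - x[0] * u[0] - x[2] * u[2]
    c = x[2] * u[0]
    return b * b - 4 * a * c


def identity_holds(trials=200, seed=0):
    """Compare both discriminants with the quartic at random integer points."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.randint(-50, 50) for _ in range(3)]
        u = [rng.randint(-50, 50) for _ in range(3)]
        q = rationalized_quartic(x, u)
        if discriminant_q1(x, u) != q or discriminant_q2(x, u) != q:
            return False
    return True
```

The check also shows why the middle coefficient of $Q_2$ has to be $x_2 u_2 - x_1 u_1 - x_3 u_3$: with any other sign pattern the discriminant would no longer be symmetric in the three products $x_i u_i$.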

The second construction is due to Aronhold. We refer to [Sal79] for details. If
$u = a x_1 + b x_2 + c x_3$, $v = a' x_1 + b' x_2 + c' x_3$ and $w = a'' x_1 + b'' x_2 + c'' x_3$ are
three non concurrent lines, then we form the three conics

$$x_2 v - x_3 w = 0, \qquad x_3 w - x_1 u = 0, \qquad x_1 u - x_2 v = 0,$$

which intersect in four points. We consider these intersection points as lines
$x_4, \dots, x_7$ in the dual plane. Then $(x_i)_{i=1,\dots,7}$ form an Aronhold system for the
quartic $\sqrt{x_1 u_1} + \sqrt{x_2 u_2} + \sqrt{x_3 u_3} = 0$ with




where is the adjoint matrix. 

Proposition 4. Let $C : \sqrt{x_1 u_1} + \sqrt{x_2 u_2} + \sqrt{x_3 u_3} = 0$ be a smooth quartic.
We denote $U_1 = (u_2 \cap u_3)^*$, $U_2 = (u_3 \cap u_1)^*$ and $U_3 = (u_1 \cap u_2)^*$ viewed as
lines in the dual plane. Let $(x_i)_{i=4,\dots,7}$ be the lines viewed in the dual plane as the
intersection points of the conics $x_1 U_1 - x_2 U_2$, $x_2 U_2 - x_3 U_3$, $x_3 U_3 - x_1 U_1$. Then
the set $(x_i)_{i=1,\dots,7}$ is an Aronhold system for $C$.

3 Model over $K = \overline{\mathbb{F}}_2$ and Lift

3.1 Ordinary Quartics in Characteristic 2

Proposition 5. A smooth quartic $C/K$ is ordinary if and only if it has 7
bitangents.

Proof. By [Ser58], we may split Oq) into two parts P”'* © where the 

Frobenius F is an isomorphism on 1^“”' and nilpotent on By Serre duality, 
the dimension of P”'* is equal to the dimension of the space of regular exact 
differentials (i.e. differentials of the form df, f € K(C)). The dimension of P™* 
is equal to the order of the group of regular logarithmic differentials (i.e. differ- 
entials of the form df / f). Moreover this dimension is also the 2-rank 7^ of C. 

If $f \in K(C)$ then $(df) = 2 D_0$ for some divisor $D_0$ which does not depend on $f$.
We call $L(D_0)$ the canonical theta characteristic bundle. By [SV87, Prop. 3.1]
we know that the dimension $h^0(L(D_0))$ equals the dimension of regular exact
differentials. So $L(D_0)$ belongs to $F_1$ if and only if $\gamma \le 2$.

According to [SV87, Prop. 3.3], there is a bijection between regular logarithmic 
differentials and non zero non canonical theta characteristic bundles which sends 
uj = {df / f) to L{ijjjT). In particular, every non zero non canonical theta charac- 
teristic bundle is in F\. 

Combining these two results, we see that the number of bitangents is 7, 4, 2 and
1 when the 2-rank $\gamma$ is respectively 3, 2, 1 and 0. In particular, C is ordinary if and
only if C has 7 bitangents.

Thanks to this proposition, it is easy to show [Rit03, Prop. 3.38] : 

Proposition 6. An ordinary quartic is isomorphic over $\overline{\mathbb{F}}_2$ to the model (*)

$(ax^2 + by^2 + cz^2 + dxy + exz + fyz)^2 = xyz(x + y + z)$

with

$abc(a + b + d)(a + c + e)(b + c + f)(a + b + c + d + e + f + 1) \ne 0.$

Conversely, every curve which satisfies this condition is an ordinary non hyperelliptic
curve of genus 3.
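As a small illustration of the nonvanishing condition in Proposition 6, the following sketch enumerates coefficient tuples $(a, \dots, f)$ in the prime field $\mathbb{F}_2$ only (an assumption made for brevity; the text works over larger fields of characteristic 2). The function name `admissible` is hypothetical and not taken from the source.

```python
from itertools import product

def admissible(a, b, c, d, e, f):
    """Nonvanishing condition of Proposition 6 for coefficients in F_2 (0 or 1)."""
    factors = [
        a, b, c,
        (a + b + d) % 2,
        (a + c + e) % 2,
        (b + c + f) % 2,
        (a + b + c + d + e + f + 1) % 2,
    ]
    # The product of the factors is nonzero in F_2 exactly when every factor is 1.
    return all(x == 1 for x in factors)

# All coefficient tuples over F_2 that satisfy the condition.
solutions = [t for t in product((0, 1), repeat=6) if admissible(*t)]
print(solutions)
```

Over $\mathbb{F}_2$ itself the condition forces every coefficient to be 1, so the list contains a single tuple; over $\mathbb{F}_{2^N}$ with $N > 1$ there is much more freedom.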

Remark 3. One may find in [NR03] a complete classification of smooth quartics 
over a finite field of characteristic 2 using the work of [Wal95]. 

Note that, if C has all its bitangents defined over k (i.e. all the 2-torsion points
of the Jacobian are rational) then C is isomorphic over k to a model (*).

3.2 Good Lift of the Model (*) 

Let $k = \mathbb{F}_q$ with $q = 2^N$ and let $\mathbb{Q}_q$ be the unramified extension of degree N
of $\mathbb{Q}_2$ with ring of integers $\mathbb{Z}_q$, $\pi$ a uniformizer and $v$ the valuation such that
$v(\pi) = 1$.

In this section, we want to relate the model (*) in characteristic 2 to the Riemann
model, which is defined in characteristic 0. The reason is that we want to
use the result on the bitangents of the Riemann model. Moreover we want
the 28 bitangents to be defined over $\mathbb{Q}_q$. Of course, this is not true if we choose the
lift arbitrarily. To achieve this, we would like to imitate the elliptic case where
$y^2 + yx = f(x)$ is lifted to $(2y + x)^2 = x^2 + 4f(x)$. But it is not clear how
to do that on the model (*) and on the Riemann model. The trick is to study a
cover of the Riemann model on which it is easy to add 'Artin-Schreier terms' $yx$.
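The elliptic lift mentioned above rests on completing the square: over the integers, $(2y + x)^2 - (x^2 + 4f(x)) = 4\,(y^2 + yx - f(x))$. The sketch below checks this identity on random integer triples, treating the value $f(x)$ as a free integer `fx`. The function names are illustrative and not part of the source.

```python
import random

def lifted_residual(x, y, fx):
    """(2y + x)^2 - (x^2 + 4 f(x)), computed over the integers."""
    return (2 * y + x) ** 2 - (x * x + 4 * fx)

def artin_schreier_residual(x, y, fx):
    """y^2 + y x - f(x), the shape of the equation in characteristic 2."""
    return y * y + y * x - fx

random.seed(0)
for _ in range(1000):
    x, y, fx = (random.randint(-10**6, 10**6) for _ in range(3))
    # The lifted equation is exactly 4 times the Artin-Schreier equation.
    assert lifted_residual(x, y, fx) == 4 * artin_schreier_residual(x, y, fx)
```

So the lifted curve and the original equation have the same solutions in characteristic 0, while the lifted form is a square on the left, which is what the Riemann model requires.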

Let $C : \sqrt{x_1 u_1} + \sqrt{x_2 u_2} + \sqrt{x_3 u_3} = 0$, over a field of characteristic 0. Consider
the curve D

$Y_1^2 = x_1 u_1$
$Y_2^2 = x_2 u_2$
$Y_3^2 = x_3 u_3$
$Y_1 + Y_2 + Y_3 = 0$

The curve D is a genus 5 unramified cover of C. The cover is given by the map

$\pi : (Y_1 : Y_2 : Y_3 : x_1 : x_2 : x_3) \mapsto (x_1 : x_2 : x_3).$

If $C/k : (ax^2 + by^2 + cz^2 + dxy + exz + fyz)^2 - xyz(x + y + z) = 0$ is given,
we want to add Artin-Schreier terms $Y_i l_i$ (with $l_i$ linear in x, y, z) to the left
members of the equations above so that the new curve D is a 2-cover of
the curve C. This is just the result of some formal computations:

Proposition 7. C is equal to the quotient of

$Y_1^2 + x Y_1 = c x (b y + (c + f) z)$
$Y_2^2 + y Y_2 = c y (a x + d y + (c + e) z)$
$Y_3^2 + (x + y) Y_3 = c (x + y)(a x + (d + b) y + (1 + c + e + f) z)$
$Y_1 + Y_2 + Y_3 = c z$

by the map $(Y_1 : Y_2 : Y_3 : x : y : z) \mapsto (x : y : z)$.

We lift identically the expressions over $\mathbb{Q}_q$ except for the $x + y$ term which is
lifted to $-2cz - x - y$. Then we complete the squares on the left and since
$x + y + (-2cz - x - y) = -2cz = -2(Y_1 + Y_2 + Y_3)$, we obtain the genus 5 curve

$(2Y_1 + l_1)^2 = l_1 (l_1 + 4 v_1)$
$(2Y_2 + l_2)^2 = l_2 (l_2 + 4 v_2)$
$(2Y_3 + l_3)^2 = l_3 (l_3 + 4 v_3)$
$(2Y_1 + l_1) + (2Y_2 + l_2) + (2Y_3 + l_3) = 0$

with

$l_1 = x, \quad l_2 = y, \quad l_3 = -2cz - x - y,$
$v_1 = bcy + c(c + f)z,$
$v_2 = acx + dcy + c(c + e)z,$
$v_3 = acx + (d + b)cy + c(1 + c + e + f)z.$

The quotient curve is a model for C over $\mathbb{Q}_q$, and by the change of coordinates
$x = x_1$, $y = x_2$ and $z = -(x_1 + x_2 + x_3)/(2c)$, we obtain the curve

$C/\mathbb{Q}_q : \sqrt{x_1 (4 v_1 + x_1)} + \sqrt{x_2 (4 v_2 + x_2)} + \sqrt{x_3 (4 v_3 + x_3)} = 0 \qquad (1)$

where $u_i = 4 v_i + x_i$ for $i = 1, 2, 3$.



Theorem 4. The model (1) has all its bitangents defined over $\mathbb{Q}_q$.

Proof. It is enough to show that there exists an Aronhold system which can be
defined over $\mathbb{Q}_q$. We show this on the model preceding the change of coordinates
by using the construction of Proposition 3. We have



$C : \sqrt{l_1 (4 v_1 + l_1)} + \sqrt{l_2 (4 v_2 + l_2)} + \sqrt{l_3 (4 v_3 + l_3)} = 0$, with $u_i = 4 v_i + l_i$.

Call $(\beta_i)_{i=1,\dots,7}$ the bitangents of the Aronhold system. We may suppose that
$\beta_1 = l_1$, $\beta_2 = l_2$ and $\beta_3 = l_3$. Consider now the family of conics

$Q_3(\lambda) = l_1 u_2 \lambda^2 + (l_3 u_3 - u_2 l_2 - u_1 l_1)\lambda + l_2 u_1.$



If we denote by $D_3(\lambda)$ the determinant of the Hessian of the family $Q_3$, we set
$P(\lambda) = D_3(\lambda)/(8\lambda)$. We want to study the roots $\lambda_i$ of this polynomial. To do
that, we let /X = A + 1 and P{X) = (mod tt). By building up the Newton 
polygon of P{y), we see that all roots are congruent to 0 modulo tt, so A^ = 1 
(mod 7 t). Let A = — 1 + tt/x for some multiple y. Plugging this into P, we obtain 
P(A)/16 = (?yf{y + c)^ (mod tt). Hence two roots are of the form —1 — ttc 
(mod $\pi^2$) and the two others are of the form $-1$ (mod $\pi^2$). These two expressions
show that

$Q_3(\lambda_i)/4 \equiv c^2 (y + z)(x + z) \pmod{\pi}$ when $\lambda_i \equiv -1 - \pi c \pmod{\pi^2}$,
$Q_3(\lambda_i)/4 \equiv c^2 z (x + y + z) \pmod{\pi}$ when $\lambda_i \equiv -1 \pmod{\pi^2}$.



We do the same computation with the family $Q_1(\lambda) = u_3 l_2 \lambda^2 + (u_1 l_1 - u_2 l_2 - u_3 l_3)\lambda + l_3 u_2$. We obtain the decomposition by pairs

$Q_1(\lambda_i)/4 \equiv c^2 (x + y + z)(y + z) \pmod{\pi}$ when $\lambda_i \equiv -1 - \pi c \pmod{\pi^2}$,
$Q_1(\lambda_i)/4 \equiv c^2 z (x + z) \pmod{\pi}$ when $\lambda_i \equiv -1 \pmod{\pi^2}$,

and at last with the family $Q_2(\lambda) = \cdots + (u_1 l_1 - u_2 l_2 - u_3 l_3)\lambda + l_3 u_2$:

$Q_2(\lambda_i)/4 \equiv c^2 (x + y + z)(x + z) \pmod{\pi}$ when $\lambda_i \equiv -1 - \pi c \pmod{\pi^2}$,
$Q_2(\lambda_i)/4 \equiv c^2 z (y + z) \pmod{\pi}$ when $\lambda_i \equiv -1 \pmod{\pi^2}$.



The bitangents $\beta_4, \beta_5, \beta_6, \beta_7$ must be in the three families but not in the same
pair, as appears from the decompositions in the proof of Proposition 3. Suppose
$\beta_4$ reduces to $x + y + z$. Is it possible that $\beta_5$ also reduces to $x + y + z$? The
family $Q_3$ then shows that neither $\beta_6$ nor $\beta_7$ reduces to $z$. The family $Q_2$ implies
that $y + z$ is the reduction of $\beta_6$ or $\beta_7$. But in this case, $Q_1$ shows that $\beta_4$ and ($\beta_6$
or $\beta_7$), or $\beta_5$ and ($\beta_6$ or $\beta_7$), are in the same pair: excluded.

Therefore, the bitangents reduce to four distinct lines. The algorithm of Proposition 3
constructs a homogeneous polynomial of degree 4 with coefficients in $\mathbb{Q}_q$
having the $(\beta_i)_{i=4,\dots,7}$ as linear factors. Now we know that, modulo $\pi$, the polynomial
splits into distinct factors. By a version of Hensel's lemma with several
variables, we can conclude that the $(\beta_i)$ are defined over $\mathbb{Q}_q$.



Remark 4. It is easy to find the kernel of the reduction: it is generated by the
divisors $((l) - (l'))/2$ for two bitangents $l$, $l'$ which reduce to the same bitangent.
In particular, if we choose the following reduction for the Aronhold system

  β1    β2    β3       β4           β5       β6    β7
  x     y     x + y    x + y + z    x + z    z     y + z


and the principal set of the example, then the kernel is generated by 

[ei] + [£ 2 ] + [ea] = (^1 q o) ’ ^ ^ (o 1 o) ’ ^ ^ (o 0 l) ' 

By [Dol03, th. 6.19], specifying an ordered Aronhold system is equivalent to giving
a symplectic basis of Jac(C)[2] for the Weil pairing. Thus, it determines the
'local part' of the AGM sequence (see Section 4 Step 2).



4 The Algorithm 

Let C/k be a genus g ordinary curve. We assume that its Jacobian is absolutely
simple and has all its $2^g$ points of order 2 defined over k. We give here the
general AGM-algorithm with specific details concerning the non hyperelliptic
genus 3 case. Several different proofs and implementations can be found for
elliptic curves ([Gau02], [Koh03], [Rit03, Sec. 2.1], [Ver03]). For hyperelliptic
curves we refer to [LL03]. For the general case we refer to [Rit03, Part. 3].

1. We lift the curve C to $\mathbb{Q}_q$. For non hyperelliptic curves of genus 3 we consider
C/k given by (*) and we lift it as in Section 3.2.

2. We compute (in terms of the coefficients of C) $2^g$ constants which are square
roots of the algebraic expressions of the corresponding complex quotients (for
a certain Riemann matrix $\Omega$):

$\left( \vartheta\begin{bmatrix} \varepsilon \\ \varepsilon' \end{bmatrix}(0, \Omega) \Big/ \vartheta\begin{bmatrix} 0 \\ 0 \end{bmatrix}(0, \Omega) \right)^4, \qquad \varepsilon, \varepsilon' \in (\mathbb{Z}/2\mathbb{Z})^g.$

Jac(C) is a principally polarized abelian variety. As C is ordinary, one can
prove that $\mathrm{Jac}(C)[2]^{loc}$ is a maximal isotropic $\mathbb{F}_2$-vector space for the Weil
pairing and therefore it is always possible to find a symplectic basis of
Jac(C)[2] such that the $2^g$ quotients we need to build the sequence are those
with $\varepsilon = 0$ (for the genus 3 non hyperelliptic case, we have shown how to
find this basis in Remark 4). We denote these quotients by $(\vartheta_\varepsilon^{(0)})_{\varepsilon \in (\mathbb{Z}/2\mathbb{Z})^g}$.
The computation of $(\vartheta_\varepsilon^{(0)})_{\varepsilon \in (\mathbb{Z}/2\mathbb{Z})^3}$ for genus 3 non hyperelliptic curves is
given in Section 2. Note that it uses 'ternary invariants' (bitangents) whereas
in the hyperelliptic case the computation is based on Thomae's formula, which
involves only 'binary invariants' (Weierstrass points) [Fay73].

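3. We build a sequence by using formulas which are a generalization of the
Arithmetic Geometric Mean. In the complex setting, they are

$\vartheta\begin{bmatrix} 0 \\ \varepsilon' \end{bmatrix}(0, 2\Omega)^2 = \frac{1}{2^g} \sum_{\delta \in (\mathbb{Z}/2\mathbb{Z})^g} \vartheta\begin{bmatrix} 0 \\ \varepsilon' + \delta \end{bmatrix}(0, \Omega) \cdot \vartheta\begin{bmatrix} 0 \\ \delta \end{bmatrix}(0, \Omega).$

In the 2-adic context, these formulas are replaced by

$\vartheta_\varepsilon^{(n+1)} = \frac{1}{2^g} \sum_{\delta \in (\mathbb{Z}/2\mathbb{Z})^g} \sqrt{\vartheta_{\varepsilon + \delta}^{(n)}} \sqrt{\vartheta_\delta^{(n)}}. \qquad (2)$

The square root of an element $x \in 1 + 8\mathbb{Z}_q$ is chosen to be congruent to 1
modulo 4.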

4. Let $\pi_i$ be the roots of the Frobenius polynomial that are 2-adic units (there
are exactly g such roots since C is ordinary). The fundamental proposition
of the AGM theory is that the quotient converges linearly to

$\alpha = \pm \pi_1 \cdots \pi_g.$

5. We compute the minimal polynomial of $\beta = \alpha + 2^{gN}/\alpha$. For genus 1 (resp.
2), this polynomial is essentially given by $\beta$ [Mes02], thus it is enough to
know this number with a precision of ?>N/2 (resp. 2N) bits. This is no longer 
true for the genus 3 case and Mestre proposed to use LLL to recover this 
polynomial. We now need to know $\beta$ with a precision of $10N$ bits. Then, it is
easy to find the characteristic polynomial of Frobenius up to a sign problem 
which we solve with fast addition in the Jacobian. This last step is illustrated 
in the following example. 
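Step 3 requires square roots of elements of $1 + 8\mathbb{Z}_q$ that are congruent to 1 modulo 4. The following is a minimal sketch of such a square root in the simplest setting $\mathbb{Z}_2$ (that is, $N = 1$), working with ordinary integers modulo $2^{prec}$ and lifting one bit at a time. It is not the optimized doubling algorithm of [LL03], and the function name `sqrt_1_mod_8` is hypothetical.

```python
def sqrt_1_mod_8(a, prec):
    """Return s with s*s = a (mod 2**prec) and s = 1 (mod 4), for a = 1 (mod 8).

    Bit-by-bit Hensel-style lifting in Z_2. The root is only determined
    modulo 2**(prec - 1), so callers should carry one spare bit of precision.
    """
    if a % 8 != 1 or prec < 3:
        raise ValueError("need a = 1 (mod 8) and prec >= 3")
    s = 1  # 1*1 = a (mod 8), and 1 = 1 (mod 4)
    for k in range(3, prec):
        # Invariant: s*s = a (mod 2**k). Replacing s by s + 2**(k-1) changes
        # s*s by 2**k modulo 2**(k+1), so one of the two candidates works.
        if (s * s - a) % (1 << (k + 1)) != 0:
            s += 1 << (k - 1)
    return s % (1 << prec)

root = sqrt_1_mod_8(17, 7)
print(root, (root * root) % 2**7)
```

Every correction added to `s` is a multiple of 4, so the normalization "congruent to 1 modulo 4" is preserved automatically.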
Remark 5. The complexity of the algorithm is the same as in the hyperelliptic
case and it depends essentially on the lift phase. Following [LL03], the complexity
in time is $O(N^{2\mu} \log N)$ where $O(N^\mu)$ is the number of bit operations for the
multiplication of two N bit length integers. With Karatsuba $\mu = \log_2(3)$, with
FFT multiplication $\mu = 1 + \varepsilon$. The complexity in space is $O(N^2)$.



5 Example 

Let C over $k = \mathbb{F}_q$, $q = 2^N$ with N = 100, be defined by

{ux^ + {up + l)ip + up + uj'^xy + {u)^ + uP)xzc + uPyz)'^ — xyz{x + y + z) = 0 

where w is a root of — 1)/{X — 1). The choice of this generator is due to 

the use of a Gaussian normal basis in [LL03] . Note that cryptographic sizes are 
about N = 60. The following computations are realized on a 731 MHz DEC 
alpha computer using C libraries and MAGMA 2.10. 



We compute the V\,V 2 , fa from the proof of Proposition 7 over an unramified 
extension of degree 100 of Q 2 : 

{ v\= {w^ + wpy + {w'^ + w^)z 

V2 = w^x + w^y + {w^ + TTwpz 

V 3 = w^x + (w® + w® + wpy + (w® + If® + 7Tw^ + wpz 



Then we compute the Aronhold system by Proposition 4 and all the bitangents
by Theorem 2. Note that Theorem 2 requires a change of variables to
send the line $x_4$ of Proposition 4 to $x_1 + x_2 + x_3$. We obtain the initial theta
constants $(\vartheta_\varepsilon^{(0)})_{\varepsilon \in (\mathbb{Z}/2\mathbb{Z})^3}$ by Theorem 3 (time: less than 1 minute).

We use Lercier-Lubicz's doubling algorithm [LL03], which is shared by the hyperelliptic
and non hyperelliptic cases. It is an optimized computation of the formula
(2). We find (time: 1 minute)



^( 1000 ) 
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-t 0(2“°°) 
We compute the minimal polynomial of $\beta = \alpha + 2^{3N}/\alpha$ with an adapted version
of LLL (see [LL03] or [Rit03, sec. 4.2] for details, time: 2 seconds). It is
generically a polynomial of degree 4.

f’sym=V‘‘ - 26715365673094521954519391602467084378059297 ■ 

-1982428428843079057759234710708713623379241158540353684707741 ■ 2^*^*^ - + 

248550842369414498721759586691744404695893277906335586587728273970726713469 ■ 2^°° ■ X + 
175058029257348169113298037630983472240590555014956076285407283685037701812736379942660161 ■ 2^*^*^ 

The formulas linking Psym and Xc following : if Xc = ~ ~ 

cX^ + bqX"^ — aq^X + q^ then 

Psym = X^- ciX^ + q{h\ - 2aiCi - 2q{al - 2hi))X‘^ 

-q^ci{a\ - 2bi - 8q)X + -I- qai{af - 4aibi + 8ci)) 

with ai = a,bi = b — 8q, ci = c — 2qa. 

Therefore, we recover the Frobenius polynomial up to a sign (i.e. X(^(±X)) 
because a determines f][ only up to a sign. This step requires only one fourth 
root in Z. 

Xc = X^ + 377276036264709 • X® -t 3455351061169045838894227937403 • X^ + 
929793021972276691307766666464616872277691871 • X® 
-t3455351061169045838894227937403 • 2^°“ • 

-t377276036264709 • 2^°° • X -t 2®°°. 

The determination of the sign can be done by answering the question: is
$\chi_C(1) \cdot D = 0$, where D is a generic degree 0 divisor? With the algorithms
developed in [FOR03], we can prove that the present $\chi_C$ has the correct sign in 4
seconds.

Remark 6. With the same ideas, we managed to compute the characteristic
polynomial of a quartic over $\mathbb{F}_{2^{5002}}$ in two weeks (the result may be found on
http://www.math.jussieu.fr/~ritzenth).

We may choose special forms for the model (*) in order that the sign problem
is more obvious. For example, consider the curves

$C : (ax + by + cz)^4 - xyz(x + y + z) = 0$

with a, b, c non zero and distinct. On this model, we have hyperflexes
$P_1, P_2, P_3, P_\infty$, respectively intersections of C with x = 0, y = 0, x + y + z = 0
and z = 0. The divisors $D_1 = P_1 - P_\infty$ and $D_2 = P_2 - P_\infty$ are linearly independent
(but $P_1 - P_\infty + P_2 - P_\infty + P_3 - P_\infty \sim ((ax + by + cz)/z)$). Now
$\mathrm{Jac}(C)[4](k) \simeq (\mathbb{Z}/4\mathbb{Z})^3$ and the action of the Frobenius on $\langle D_1, D_2 \rangle$ is trivial since they
are rational divisors. So at least two roots of $\chi_C$ are congruent to 1 modulo
4. If we change the sign at most one would be. So to decide the sign, we have
only to check the 2-adic valuations of the roots of $\chi_C(X + 1)$. Note that since
these models have a hyperflex, they are $C_{3,4}$ curves for which the algorithms of
[FOR03] for addition in the Jacobian are even better.

Conclusion

We have described a generalization of the AGM-algorithm of Mestre for non hy- 
perelliptic curves of genus 3. For higher genus, there are several difficulties : first, 
it seems to be quite difficult to find in general (i.e. for a non hyperelliptic curve) 
an algebraic expression of the initial theta constants. Moreover the complexity 
of the algorithm grows exponentially with the genus and owing to the use of 
LLL, the constant is quite bad (even for g = 3). A first improvement would be 
to find not only the product of eigenvalues of Frobenius but the whole matrix of 
the action. A possible approach is maybe to find suitable bases for the invariant 
differentials of Mumford equations (see [Mum83]).

Abstract. This paper presents an investigative account of arbitrary cubic
function fields. We present an elementary classification of the signature
of a cubic extension of a rational function field of finite characteristic
at least five; the signature can be determined solely from the coefficients
of the defining curve. We go on to study such extensions from an algorithmic
perspective, presenting efficient arithmetic of reduced ideals in
the maximal order as well as algorithms for computing the fundamental
unit(s) and the regulator of the extension.



1 Introduction 

The arithmetic of algebraic curves over finite fields is a subject of considerable
interest, due to its mathematical importance as well as its applications to cryptography.
Since general-purpose methods tend to be computationally inefficient,
the discussion of fast algorithms has so far predominantly focused on elliptic and
hyperelliptic curves. In addition, the arithmetic of purely cubic curves $y^3 = D(x)$
has been investigated in considerable detail [12,10,11,2]; Picard curves represent
a special case thereof. Several other particular classes of curves have also been
studied from an algorithmic point of view, such as superelliptic curves (curves
of the form $y^n = D(x)$) [5] and $C_{ab}$ curves [1].

In this paper, we investigate arbitrary cubic extensions of a rational function 
field of finite characteristic. We give a simple technique for finding the signature 
(and thus the unit rank) of a cubic extension when the characteristic is at least 
five; the signature can be determined solely from the coefficients of the defining 
curve. We also investigate efficient arithmetic of reduced fractional ideals in the 
maximal order of the field and show how to use this arithmetic to find the 
fundamental unit(s) and the regulator of the extension. Our method is based on 
a procedure that was originally developed by Voronoi for cubic number fields 
[14] and was recently adapted to purely cubic function fields [12,6].

2 Cubic Function Fields and Curves



Let $k = \mathbb{F}_q$ be a finite field of characteristic not equal to 3, and denote by $\bar{k}$
its algebraic closure. Consider an absolutely irreducible nonsingular affine plane
curve $C_0$ defined by an equation $H(x, y) = 0$ where $H \in k[x][Y]$ is a bivariate
polynomial of degree 3 in $Y$ which is irreducible over $k(x)$; write $H(x, Y) =
SY^3 + UY^2 + VY + W$ with $S, U, V, W \in k[x]$, $SW \ne 0$. Then the function
field K of $C_0$ over k is a cubic extension of the rational function field $k(x)$ with
minimal polynomial $H(x, Y)$; that is, $K = k(x, y)$.

It is easy to verify that under the transformation $(x, y) \to (x, S^{-1}(y - U/3))$,
$C_0$ is birationally equivalent to the curve $C_1 : y^3 - Ay + B = 0$ where

$A = \frac{U^2}{3} - SV, \qquad B = S^2 W - \frac{SUV}{3} + \frac{2U^3}{27}.$

Furthermore, the singular points on $C_1$ are exactly the points $(a, U(a)/3) \in \bar{k}^2$
where $S(a) = 0$. If $Q^2$ divides A and $Q^3$ divides B for some $Q \in k[x]$, then
all points of the form $(a, 0)$ with $Q(a) = 0$ are singular, and $C_1$ is birationally
equivalent to the curve $y^3 - (A/Q^2)y + (B/Q^3) = 0$. For brevity, we call a
(possibly singular) curve C a standard model for $K/k(x)$ if C is of the form
$y^3 - Ay + B = 0$ with $A, B \in k[x]$, $B \ne 0$, and for no $Q \in k[x]$ does $Q^2$ divide
A and $Q^3$ divide B. We also say that such a curve, and its function field, are
in standard form. Clearly, every absolutely irreducible nonsingular affine plane
curve over k of degree 3 in y is birationally equivalent to a standard model.
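The reduction to the form $y^3 - Ay + B = 0$ is a polynomial identity: substituting $Y = S^{-1}(y - U/3)$ into $H$ and multiplying by $S^2$ gives $y^3 - Ay + B$ with $A = U^2/3 - SV$ and $B = S^2 W - SUV/3 + 2U^3/27$. The sketch below checks this numerically, with rational numbers standing in for the polynomials $S, U, V, W$ (an assumption made only to keep the example short). The function names are illustrative.

```python
from fractions import Fraction as Fr
import random

def H(S, U, V, W, Y):
    """The general cubic S*Y^3 + U*Y^2 + V*Y + W."""
    return S * Y**3 + U * Y**2 + V * Y + W

def standard_coefficients(S, U, V, W):
    """Coefficients (A, B) of the model y^3 - A*y + B = 0."""
    A = U**2 / 3 - S * V
    B = S**2 * W - S * U * V / 3 + 2 * U**3 / 27
    return A, B

random.seed(1)
for _ in range(200):
    S = Fr(random.randint(1, 9), random.randint(1, 9))  # S must be nonzero
    U, V, W, y = (Fr(random.randint(-9, 9), random.randint(1, 9)) for _ in range(4))
    A, B = standard_coefficients(S, U, V, W)
    Y = (y - U / 3) / S
    # S^2 * H(x, Y) equals y^3 - A*y + B identically.
    assert S**2 * H(S, U, V, W, Y) == y**3 - A * y + B
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a floating point approximation.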

A standard model is purely cubic if A = 0. Note that if $q \equiv 1 \pmod 3$
(this can always be accomplished by adjoining a primitive cube root of unity to k
if necessary), then by Kummer theory, a cubic extension $K/k(x)$ has a purely
cubic model if and only if it is a Galois extension (see Lemma 2.1 of [6]). If
$q \equiv -1 \pmod 3$, it is not clear which cubic extensions over the field $\mathbb{F}_q(x)$ have
purely cubic representations.

For a curve $y^3 - Ay + B = 0$ in standard form with function field K, the
polynomial $f = f(Y) = Y^3 - AY + B \in k[x][Y]$ is the minimal polynomial
of $K/k(x)$. It has three distinct roots $y_0 = y$, $y_1 = y'$, $y_2 = y''$ in an algebraic
extension of $k(x)$ of degree at most 6. For any $\alpha = a + by + cy^2 \in K$ with
$a, b, c \in k(x)$, denote by $\alpha' = a + by' + c(y')^2$ and $\alpha'' = a + by'' + c(y'')^2$ the
conjugates of $\alpha$. The norm of $\alpha$ is $N(\alpha) = \alpha\alpha'\alpha'' \in k(x)$ and the trace of $\alpha$ is
$Tr(\alpha) = \alpha + \alpha' + \alpha''$; both are rational functions (i.e. in $k(x)$). The discriminant of
f is the nonzero polynomial $D = (y - y')^2 (y' - y'')^2 (y'' - y)^2 = 4A^3 - 27B^2 \in k[x]$.
We recall that if k has odd characteristic, then $K/k(x)$ is a Galois extension if
and only if D is a square; in particular, a purely cubic extension is Galois if and
only if $q \equiv 1 \pmod 3$.

We have the following simple characterization of singular points: 

Lemma 2.1. Let $C : y^3 - Ay + B = 0$ be a standard model of a cubic extension
$K/k(x)$ where k has characteristic at least 5. Set $D = 4A^3 - 27B^2$, and let
$a \in \bar{k}$. Then $(a, b)$ is a singular point of C for some $b \in \bar{k}$ if and only if $D(a) =
\frac{dD}{dx}(a) = 0$ and either $B(a) \ne 0$ or $B(a) = \frac{dB}{dx}(a) = 0$. In the latter case, we

have b = A{a) = ^^{a) = 0. 

Proof. $(a, b) \in \bar{k}^2$ is a singular point of C if and only if

$b^3 - A(a)\, b + B(a) = 0, \qquad (2.1)$
$3b^2 - A(a) = 0, \qquad (2.2)$
$\frac{dA}{dx}(a)\, b - \frac{dB}{dx}(a) = 0. \qquad (2.3)$


Suppose (2.1) - (2.3) hold, then $A(a) = 3b^2$ and $B(a) = 2b^3$, so $D(a) = 0$, and
$\frac{dD}{dx}(a) = 54 B(a) \left( \frac{dA}{dx}(a)\, b - \frac{dB}{dx}(a) \right) = 0$. Furthermore, if $B(a) = 0$, then $b = 0$,
so A{a) = 0. In this case, (2.3) yields §f (a) = 0, and hence = 0. 

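Conversely, suppose that $D(a) = \frac{dD}{dx}(a) = 0$ and either $B(a) \ne 0$ or $B(a) =
\frac{dB}{dx}(a) = 0$. Then $4A(a)^3 = 27B(a)^2$, so there exists $b \in \bar{k}$ with $A(a) = 3b^2$
and $B(a) = 2b^3$. It follows that (2.1) and (2.2) hold. Now $\frac{dD}{dx}(a) = 0$ implies
$B(a) \left( \frac{dA}{dx}(a)\, b - \frac{dB}{dx}(a) \right) = 0$. If $B(a) \ne 0$, then (2.3) holds, and if $B(a) =$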
^(a) = 0, then 6 = 0 and (2.3) holds as well; furthermore, in the latter case, 
A{a) = 0 and ^(a) =0. □ 
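The criterion of Lemma 2.1 can be compared with a direct search for singular points. The sketch below works over a prime field $\mathbb{F}_p$ with $p \ge 5$ and only considers abscissae $a \in \mathbb{F}_p$ (a restriction chosen for brevity; the lemma itself allows $a$ in the algebraic closure). Polynomials are coefficient lists, lowest degree first, and all function names are hypothetical.

```python
def ev(f, a, p):
    """Evaluate the polynomial f at a modulo p (Horner's rule)."""
    r = 0
    for c in reversed(f):
        r = (r * a + c) % p
    return r

def deriv(f, p):
    """Formal derivative of f modulo p."""
    return [(i * c) % p for i, c in enumerate(f)][1:] or [0]

def mul(f, g, p):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def disc(A, B, p):
    """D = 4*A^3 - 27*B^2 as a coefficient list modulo p."""
    A3, B2 = mul(A, mul(A, A, p), p), mul(B, B, p)
    n = max(len(A3), len(B2))
    A3 = A3 + [0] * (n - len(A3))
    B2 = B2 + [0] * (n - len(B2))
    return [(4 * s - 27 * t) % p for s, t in zip(A3, B2)]

def singular_abscissae_bruteforce(A, B, p):
    """All a in F_p such that (a, b) is singular on y^3 - A*y + B for some b in F_p."""
    dA, dB = deriv(A, p), deriv(B, p)
    found = set()
    for a in range(p):
        for b in range(p):
            f = (b**3 - ev(A, a, p) * b + ev(B, a, p)) % p
            fy = (3 * b * b - ev(A, a, p)) % p
            fx = (-ev(dA, a, p) * b + ev(dB, a, p)) % p
            if f == 0 and fy == 0 and fx == 0:
                found.add(a)
    return found

def singular_abscissae_lemma(A, B, p):
    """All a in F_p with D(a) = D'(a) = 0 and (B(a) != 0 or B(a) = B'(a) = 0)."""
    D = disc(A, B, p)
    dD, dB = deriv(D, p), deriv(B, p)
    return {a for a in range(p)
            if ev(D, a, p) == 0 and ev(dD, a, p) == 0
            and (ev(B, a, p) != 0 or ev(dB, a, p) == 0)}

# Example: y^3 - 3y + (x^2 + 2) over F_7 is singular at (0, 1).
print(singular_abscissae_bruteforce([3], [2, 0, 1], 7),
      singular_abscissae_lemma([3], [2, 0, 1], 7))
```

For $a \in \mathbb{F}_p$ the ordinate $b$ of a singular point is forced to lie in $\mathbb{F}_p$ as well (it is 0 or $3B(a)/(2A(a))$), which is why the search over $b \in \mathbb{F}_p$ is sufficient here.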

For $G, P \in k[x]$, let $v_P(G)$ denote the maximal power of P dividing G. By
Lemma 2.1, the curve C is nonsingular if and only if $v_P(D) \ge 2$ implies $v_P(B) = 1$
for every irreducible divisor $P \in k[x]$ of D. This implies the following:

Corollary 2.2. Let $C : y^3 - Ay + B = 0$ be a standard model of a cubic extension
$K/k(x)$ where k has characteristic at least 5. Set $D = 4A^3 - 27B^2$. Then C is
nonsingular if and only if $\gcd(D, B)$ is squarefree.

If $\Delta$ is the discriminant of $K/k(x)$ (unique up to nonzero constant square
factors), then there exists $I \in k[x]$ (the index or conductor of y) such that $D =
I^2 \Delta$. The curve C is nonsingular if and only if $I \in k^* = k \setminus \{0\}$, i.e. if and only
if D and $\Delta$ agree up to a square factor in k. Using a result due to Llorente and
Nart (see Theorem 2 of [8]) that is readily extendable from cubic number fields
to their function field analogue, one can easily compute $\Delta$ and I from D:

Lemma 2.3. Let $C : y^3 - Ay + B = 0$ be a standard model of a cubic extension
$K/k(x)$ where k has characteristic different from 3. If $\Delta$ is the discriminant of
$K/k(x)$ and $P \in k[x]$ is any irreducible divisor of $D = 4A^3 - 27B^2$, then

• $v_P(\Delta) = 2$ if and only if $v_P(A) \ge v_P(B) \ge 1$;

• $v_P(\Delta) = 1$ if and only if $v_P(D)$ is odd;

• $v_P(\Delta) = 0$ otherwise, i.e. if and only if $v_P(D)$ is even and
$v_P(A) = v_P(B) = 0$.

The characterization of the “otherwise” case stems from the condition $v_Q(A) \ge 2$
forcing $v_Q(B) \le 2$ for all $Q \in k[x]$. The same condition implies in the case where
$v_P(\Delta) = 2$ for any $P \mid D$ that $1 \le v_P(B) \le 2$. Note also that if $v_P(D)$ is odd,
then either $v_P(A) = v_P(B) = 0$ or $1 = v_P(A) < v_P(B)$.

3 Integral Bases

Let $f(x, y) = 0$ with $f = f(Y) = Y^3 - AY + B \in k[x, Y]$ be the standard
model of an affine plane curve defining a cubic function field $K = k(x, y)$. As
before, let $D = 4A^3 - 27B^2 = I^2 \Delta$ where $I \in k[x]$ is the index of y and $\Delta$ is the
discriminant of $K/k(x)$. The integral closure of $k[x]$ in K is the ring of regular
functions or maximal order of $K/k(x)$ and is denoted by $\mathcal{O}$. $\mathcal{O}$ is a $k[x]$-module
of rank 3, and any $k[x]$-basis of $\mathcal{O}$ is called an integral basis of $K/k(x)$.

Every nonzero ideal $\mathfrak{a}$ in $\mathcal{O}$ is a $k[x]$-submodule of $\mathcal{O}$ of rank 3; write $\mathfrak{a} =
[\lambda, \phi, \psi]$ where $\{\lambda, \phi, \psi\}$ is any $k[x]$-basis of $\mathfrak{a}$. The norm $N(\mathfrak{a})$ is the (finite)
group index $[\mathcal{O} : \mathfrak{a}]$; it is a nonzero constant multiple of the determinant of the 3
by 3 transformation matrix with polynomial entries that maps any integral basis
to any $k[x]$-basis of $\mathfrak{a}$. The discriminant of $\mathfrak{a}$ is $\Delta(\mathfrak{a}) = N(\mathfrak{a})^2 \Delta$; it is unique up
to nonzero constant factors. We have $\Delta(\mathcal{O}) = \Delta$, and since $\Delta([1, y, y^2]) = D =
I^2 \Delta$, the norm of the ideal $k[x, y] = [1, y, y^2]$ is a constant multiple of I.

Our goal is to find an integral basis of $K/k(x)$ that is suitable for computation.
Voronoi (see [4, pp. 108-112]) first proposed how to do this for cubic
number fields.

Lemma 3.1. For any integral basis of K of the form $\{1, \phi, \psi\}$ where $\phi = y + S$
and $\psi = (y^2 + Ty + U)/I$ with $S, T, U \in k[x]$, we have $3T^2 - A \equiv 0 \pmod{I}$ and
$T^3 - AT + B \equiv 0 \pmod{I^2}$.

Proof. Since $\phi\psi, \psi^2 \in \mathcal{O}$, there must exist $r, s, t, u, v, w \in k[x]$ such that $\phi\psi =
r\psi + s\phi + t$ and $\psi^2 = u\psi + v\phi + w$. An easy but tedious calculation reveals that
$s = (T^2 - U - A)/I$, $u = (T^2 + 2U + A)/I$, and $v = (AT - T^3 - B)/I^2$. So
$2s + u = (3T^2 - A)/I \in k[x]$ and $-v = (T^3 - AT + B)/I^2 \in k[x]$. □

It is clear that a basis of the form described in Lemma 3.1 (and hence a
polynomial T with $3T^2 - A \equiv 0 \pmod{I}$ and $T^3 - AT + B \equiv 0 \pmod{I^2}$)
always exists.

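Corollary 3.2. Let $T \in k[x]$ with $3T^2 - A \equiv 0 \pmod{I}$ and $T^3 - AT + B \equiv
0 \pmod{I^2}$. Then the set $\{1, \rho, \omega\}$ with

$\rho = y - T, \qquad \omega = \frac{1}{I}(y^2 + Ty + T^2 - A)$

is an integral basis of $K/k(x)$ with $\rho\omega \in k[x]$.

Proof. Let $3T^2 - A = EI$ and $T^3 - AT + B = FI^2$ with $E, F \in k[x]$. We have
$\rho^3 + 3T\rho^2 + EI\rho + FI^2 = 0$ and $\omega^3 - E\omega^2 + 3FT\omega - F^2 I = 0$, so $\rho$ and $\omega$ are
integral over $k[x]$ and hence lie in $\mathcal{O}$. Now $[1, \rho, I\omega] = [1, y, y^2]$, so $I^2 \Delta([1, \rho, \omega]) =
\Delta([1, \rho, I\omega]) = \Delta([1, y, y^2]) = D = I^2 \Delta$ and hence $\Delta([1, \rho, \omega]) = \Delta = \Delta(\mathcal{O})$. It
follows that $[1, \rho, \omega] = \mathcal{O}$. Finally, $\rho\omega = -FI \in k[x]$. □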

Note that we can always choose T so that deg(T) < deg(I), in which case
the above basis is polynomially bounded in the size of the coefficients A, B of
the standard model. We call a basis of the type given in Corollary 3.2 such that
deg(T) < deg(I) a canonical basis of $K/k(x)$. The following identities are easily
verified and show that canonical bases are indeed very suitable for computation.
Here, $3T^2 - A = EI$ and $T^3 - AT + B = FI^2$ with $E, F \in k[x]$.

$\rho^2 = I\omega - 3T\rho - EI, \quad Tr(\rho) = -3T, \quad N(\rho) = -FI^2,$
$\omega^2 = E\omega - F\rho - 3FT, \quad Tr(\omega) = E, \quad N(\omega) = F^2 I,$
$\rho\omega = -FI.$

Note that when our curve is nonsingular, i.e. $I \in k^*$, then we may take $I = 1$
and $T = 0$, in which case $E = -A$, $F = B$, $\rho = y$, and $\omega = y^2 - A$. If $K/k(x)$
is purely cubic, then $A = 0$, so we may once again take $T = 0$. In this case
$E = 0$, so I is the square part and F the squarefree part of B. Here, $\rho = y$ and
$\omega = y^2/I$.
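The multiplication identities for a canonical basis are formal consequences of $y^3 = Ay - B$ and of the definitions of $E$ and $F$, so they can be checked with exact arithmetic. The sketch below represents an element $c_0 + c_1 y + c_2 y^2$ as a triple and uses rational numbers in place of the polynomials $A, B, T, I$ (an assumption for illustration only; integrality of $E$ and $F$ plays no role in the identities). All names are hypothetical.

```python
from fractions import Fraction as Fr

def mul(u, v, A, B):
    """Multiply c0 + c1*y + c2*y^2 elements modulo y^3 - A*y + B."""
    c = [Fr(0)] * 5
    for i in range(3):
        for j in range(3):
            c[i + j] += u[i] * v[j]
    # Reduce with y^4 = A*y^2 - B*y and y^3 = A*y - B.
    c[2] += A * c[4]
    c[1] -= B * c[4]
    c[1] += A * c[3]
    c[0] -= B * c[3]
    return tuple(c[:3])

def add(*vs):
    return tuple(sum(cs) for cs in zip(*vs))

def scale(c, v):
    return tuple(c * x for x in v)

def canonical_data(A, B, T, I):
    """Return (E, F, rho, omega) for the basis {1, rho, omega} of Corollary 3.2."""
    E = (3 * T**2 - A) / I
    F = (T**3 - A * T + B) / I**2
    rho = (-T, Fr(1), Fr(0))                # rho = y - T
    omega = ((T**2 - A) / I, T / I, 1 / I)  # omega = (y^2 + T*y + T^2 - A) / I
    return E, F, rho, omega

A, B, T, I = Fr(5), Fr(7), Fr(2), Fr(3)
E, F, rho, omega = canonical_data(A, B, T, I)
one = (Fr(1), Fr(0), Fr(0))

# rho^2 = I*omega - 3*T*rho - E*I
assert mul(rho, rho, A, B) == add(scale(I, omega), scale(-3 * T, rho), scale(-E * I, one))
# omega^2 = E*omega - F*rho - 3*F*T
assert mul(omega, omega, A, B) == add(scale(E, omega), scale(-F, rho), scale(-3 * F * T, one))
# rho*omega = -F*I
assert mul(rho, omega, A, B) == scale(-F * I, one)
```

In an actual implementation the entries would be polynomials over $\mathbb{F}_q$ and $T$ would be chosen so that $E$ and $F$ are polynomials; the structure of the multiplication table is the same.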

4 Signatures 

In order to determine the behavior at infinity of a cubic function field extension 
(which in turn will reveal the signature), we first require some notation and a 
simple lemma. For any finite field F^, we denote by the field of Laurent 

series in (e G N); note that k{x~^) is the completion with respect to the 

infinite place of Fg(cc). If a = is any nonzero element in Fg(x“^/®) 

with m G Z, Oi G Fq for i < m, and am yf 0, then sgn(a) = am is the sign, 
deg(of) = m the degree (in x~^), and |o;| = absolute value of a. 

The following simple lemma will prove useful. 

Lemma 4.1. Let q be any prime power, p a prime not dividing q, and a a 
nonzero element in Fg(x“^). If deg{a) is divisible by p, then a has a p-th root in 
Fq(sgn(a)^/^’)(a;“^), otherwise a has a p-th root in Wq{x~~^P) , but in no subfield 
of Laurent series of¥g{x~^P). 

Proof Let (3 = bp G F,(x“^). Then (3^ = YhZ-oo where Cp„ = 

and for z G N, Cpn-i = pblf~^b„-i + fi where ft is a homogeneous polynomial 
of degree p in bn-i+i,bn-i+ 2 , ■ ■ ■ jbn with coefficients in Fg. In particular, if 
(3P G Fg(cc“^), i.e. Ci G Fg for i < pn, then inductively, bi G Fg(6„) for i = 
n,n — l,n — 2 , . . . 

Now let a G Fg(x“^) and write deg(a) = pn r with 0 < r < p — 1. Set 
7 = x~^a, so 7 G Fg(x“^) with deg( 7 ) = pn. Write 7 = X^t-oo bn be 

any p-th root of Cp„ and recursively define b„-i = {cpn-i — fi) /pblp^ G Fg(6„) for 
z G N, where fi is the polynomial in bn-i+ 2 , ■■ - bn described above. If we 

set [3 = Xr=-oo bp, then f3 G Fg( 6 „)(x“^) and P = 7 . Therefore a = {pPpp . 
If r = 0, then a has a p-th root in Fg(sgn(a)^/P)(a:“^), otherwise the smallest 
field of Laurent series containing ap-th root of a is F = Fg(&„)(a^~^)(a:’'/^). Since 
r is coprime to p, we have L = Fg(x“^)(a;^/^). Clearly L C Fg(x’ and since 
both fields are extensions of degree p of Fg(a;“^), they must be equal. □ 
If F is any finite algebraic extension of $\mathbb{F}_q(x)$ of degree n, then the place at
infinity of $\mathbb{F}_q(x)$ splits in F as

(oo) (4-1) 

where s G N, and for 1 < i < s, pj is a place of F of residue degree G N 
and ramification index G N with Then the completion of F 

with respect to the place pj is Fp. = If we sort the pairs (ei,fi), 

I < i < s, in lexicographical order, then the 2s-tuple (ei, /i, 62 , / 2 , • ■ • , Cg, /«) is 
the signature of F/F^(a;). 

We are now ready to determine the signature of a cubic extension. Note that 
transforming such an extension into standard form as described in Section 1 does 
not affect the signature. 

Theorem 4.2. Let $C : f(x, y) = 0$ with $f(Y) = Y^3 - AY + B \in k[x][Y]$ be a
standard model of a cubic extension $K/k(x)$ where $k = \mathbb{F}_q$ is a finite field of
characteristic at least 5. Set $D = 4A^3 - 27B^2$. Then $K/k(x)$ has signature

• (1, 1, 1, 1, 1, 1) if

o $|A|^3 > |B|^2$, deg(A) even, and sgn(A) is a square in k, or
o $|A|^3 < |B|^2$, $\deg(B) \equiv 0 \pmod 3$, sgn(B) is a cube in k, and $q \equiv 1 \pmod 3$, or
o $|A|^3 = |B|^2$, $4\,\mathrm{sgn}(A)^3 \ne 27\,\mathrm{sgn}(B)^2$, and the equation $t^3 - \mathrm{sgn}(A)\, t + \mathrm{sgn}(B) = 0$ has three roots in k, or
o $|A|^3 = |B|^2$, $4\,\mathrm{sgn}(A)^3 = 27\,\mathrm{sgn}(B)^2$, deg(D) is even, and sgn(D) is a square in k;

• (1, 1, 1, 2) if

o $|A|^3 > |B|^2$, deg(A) even, and sgn(A) is not a square in k, or
o $|A|^3 < |B|^2$, $\deg(B) \equiv 0 \pmod 3$, sgn(B) is a cube in k, and $q \equiv -1 \pmod 3$, or
o $|A|^3 = |B|^2$, $4\,\mathrm{sgn}(A)^3 \ne 27\,\mathrm{sgn}(B)^2$, and the equation $t^3 - \mathrm{sgn}(A)\, t + \mathrm{sgn}(B) = 0$ has one root in k, or
o $|A|^3 = |B|^2$, $4\,\mathrm{sgn}(A)^3 = 27\,\mathrm{sgn}(B)^2$, deg(D) is even, and sgn(D) is not a square in k;

• (1, 3) if

o $|A|^3 < |B|^2$, $\deg(B) \equiv 0 \pmod 3$, and sgn(B) is not a cube in k, or
o $|A|^3 = |B|^2$, $4\,\mathrm{sgn}(A)^3 \ne 27\,\mathrm{sgn}(B)^2$, and the equation $t^3 - \mathrm{sgn}(A)\, t + \mathrm{sgn}(B) = 0$ has no roots in k;

• (1, 1, 2, 1) if

o $|A|^3 > |B|^2$ and deg(A) is odd, or
o $|A|^3 = |B|^2$ and deg(D) is odd (so $4\,\mathrm{sgn}(A)^3 = 27\,\mathrm{sgn}(B)^2$);

• (3, 1) if $|A|^3 < |B|^2$ and $\deg(B) \not\equiv 0 \pmod 3$.
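The case distinction of Theorem 4.2 translates directly into a decision procedure. The sketch below assumes a prime field $k = \mathbb{F}_p$ with $p \ge 5$ (so $q = p$), represents polynomials as coefficient lists with the lowest degree first, and compares $|A|^3$ with $|B|^2$ through $3\deg(A)$ and $2\deg(B)$. It does not check that the input is an irreducible standard model, and the function names are hypothetical.

```python
def deg(f, p):
    """Degree of f modulo p; -1 for the zero polynomial."""
    d = len(f) - 1
    while d >= 0 and f[d] % p == 0:
        d -= 1
    return d

def mul(f, g, p):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def disc(A, B, p):
    """D = 4*A^3 - 27*B^2 modulo p."""
    A3, B2 = mul(A, mul(A, A, p), p), mul(B, B, p)
    n = max(len(A3), len(B2))
    A3 = A3 + [0] * (n - len(A3))
    B2 = B2 + [0] * (n - len(B2))
    return [(4 * s - 27 * t) % p for s, t in zip(A3, B2)]

def is_square(a, p):
    return pow(a, (p - 1) // 2, p) == 1

def is_cube(a, p):
    # If p = 2 (mod 3), every element of F_p is a cube.
    return p % 3 != 1 or pow(a, (p - 1) // 3, p) == 1

SPLIT, INERT2, INERT3 = (1,) * 6, (1, 1, 1, 2), (1, 3)
RAM2, RAM3 = (1, 1, 2, 1), (3, 1)

def signature(A, B, p):
    """Signature of y^3 - A*y + B over F_p(x), following Theorem 4.2."""
    m, n = deg(A, p), deg(B, p)
    if n < 0:
        raise ValueError("B must be nonzero")
    a = A[m] % p if m >= 0 else 0
    b = B[n] % p
    if 3 * m > 2 * n:                       # |A|^3 > |B|^2
        if m % 2 == 1:
            return RAM2
        return SPLIT if is_square(a, p) else INERT2
    if 3 * m < 2 * n:                       # |A|^3 < |B|^2 (includes A = 0)
        if n % 3 != 0:
            return RAM3
        if not is_cube(b, p):
            return INERT3
        return SPLIT if p % 3 == 1 else INERT2
    if (4 * a**3 - 27 * b**2) % p != 0:     # |A|^3 = |B|^2, separable sign cubic
        roots = sum(1 for t in range(p) if (t**3 - a * t + b) % p == 0)
        return {3: SPLIT, 1: INERT2, 0: INERT3}[roots]
    D = disc(A, B, p)                       # leading terms cancel in D
    d = deg(D, p)
    if d < 0:
        raise ValueError("discriminant must be nonzero")
    if d % 2 == 1:
        return RAM2
    return SPLIT if is_square(D[d], p) else INERT2

print(signature([0], [0, 1], 7))            # y^3 + x over F_7(x)
```

For a non-prime field $\mathbb{F}_q$ the same logic applies with field arithmetic in place of integer arithmetic modulo $p$ and with $q$ in place of $p$ in the square and cube tests.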

Proof. Let $l = \max\{\lceil \deg(A)/2 \rceil, \lceil \deg(B)/3 \rceil\}$ and consider the polynomial
$f_\infty(Y) = x^{-3l} f(x, Y x^l) = Y^3 - A(x) x^{-2l} Y + B(x) x^{-3l} \in k[x^{-1}, Y]$. If

$f_\infty(Y) = P_1(Y)^{e_1} P_2(Y)^{e_2} \cdots P_s(Y)^{e_s}$

is the factorization of $f_\infty(Y)$ into powers of distinct monic irreducible polynomials
in $k(x^{-1})[Y]$, where $s \in \mathbb{N}$, $e_1, e_2, \dots, e_s \in \mathbb{Z}^{>0}$, and $P_i$ has degree $f_i$ in
Y for $1 \le i \le s$, then with the proper ordering, $(e_1, f_1, e_2, f_2, \dots, e_s, f_s)$ is the
signature of $K/k(x)$.

Clearly, $\theta = \theta(x)$ is a root of $f_\infty(Y) = 0$ if and only if $\theta x^l$ is a root of $f(Y) = 0$.
Hence, in order to determine the signature of K/k{x), it suffices to find for each 
zero a of / minimal positive integers e and / such that a G 

If k has characteristic at least 5, then the zeros $\{y, y', y''\} = \{y_0, y_1, y_2\}$
of f are given by Cardano's formulae

$y_i = \frac{1}{3}(u^i \delta_+ + u^{-i} \delta_-) \quad (i = 0, 1, 2), \qquad (4.2)$

where u is a primitive cube root of unity and $\delta_+ = y_0 + u^2 y_1 + u y_2$, $\delta_- =
y_0 + u y_1 + u^2 y_2$. Here

$\delta_+ = \sqrt[3]{-\tfrac{3}{2}\left(9B + \sqrt{-3D}\right)}, \qquad \delta_- = \sqrt[3]{-\tfrac{3}{2}\left(9B - \sqrt{-3D}\right)}, \qquad (4.3)$

where the cube roots are taken so that $\delta_+ \delta_- = 3A$ (note that this leaves three
choices for the cube root of $\delta_+$, but different choices for this cube root only lead
to a different ordering of the roots $y_0, y_1, y_2$).

For brevity, set m = deg(A), n = deg(B), a = sgn(A), and b = sgn(B).

Case < \B\'^: By Lemma 4.1, G k{x~^), implying G k{x~^) 

and hence again by Lemma 4.1, <5+,<5_ G k{x~^^^). From (4.3), |<5+| = > 

|i5_|, sgn(5+) = — 3&^/^ for some cube root 6^/^ of b, and for z = 0, 1, 2: |j/j| = 
and sgn(z/j) = — zz*6^/^ from (4.2). 

If $n \not\equiv 0 \pmod 3$, then $y_i \in k\langle x^{-1/3}\rangle \setminus k\langle x^{-1}\rangle$ for $i = 0, 1, 2$, so $K/k(x)$ has
signature $(3,1)$, whereas if $n \equiv 0 \pmod 3$, then by Lemma 4.1, $y_i \in k(b^{1/3}, u)\langle x^{-1}\rangle$
for $i = 0, 1, 2$, so any ramification index $e_i$ in the signature must be 1. In
this case, if $b$ is not a cube in $k$, then $q \equiv 1 \pmod 3$ (as otherwise, every
element in $k$ is a cube), so $u \in k$, $[k(b^{1/3}, u) : k] = 3$, and the
signature is $(1,3)$. On the other hand, if $b$ is a cube in $k$, then $y_0 \in k\langle x^{-1}\rangle$ and
$y_1, y_2 \in k(u)\langle x^{-1}\rangle$. Hence, if $q \equiv 1 \pmod 3$, or equivalently, $u \in k$, then the
signature is $(1,1,1,1,1,1)$, whereas if $q \equiv -1 \pmod 3$, then $[k(u) : k] = 2$, and
since $\mathrm{sgn}(y_1), \mathrm{sgn}(y_2) \in k(u) \setminus k$, the signature must be $(1,1,1,2)$.

Case > |Hp: Here, G |<5+p, |<5_p = g3™/2 and sgn(z5i^) = 

— sgn((5i) = (— 3a)^/^. By Lemma 4.1, 5+,5_ G so z/i G k{x~^^'^) for 

z = 0, 1, 2. Choose the cube root of i5+ so that sgn(<5+) = — sgn(<5_) = (— 3a)^/^. 
Then |yi| = |z/ 2 | = g^^^ > |yol- 

If $m$ is odd, then by Lemma 4.1, $y_1, y_2 \in k\langle x^{-1/2}\rangle \setminus k\langle x^{-1}\rangle$, so at least
one of the ramification indices in the signature is 2, forcing signature $(1,1,2,1)$.
Suppose now that rzz is even, then <5+,<5_ G /c((— 3a)^/^)(x“^). Write (5+ = f3 + js 
with £ k{x~^) and s^ = —3a. Since <5+ = = (i5_)^ where the map 

: fc(s) k{s) takes s to — s, we have <5_ = u^S+ = — js) for some 

j G {0, 1,2}. Since 3A = 5_|_(5_ = zz^(/3^ — 3aj^), we have G k. If j = 0, then 
it is simple to deduce $y_0 \in k\langle x^{-1}\rangle$ and $y_1, y_2 \in k(a^{1/2})\langle x^{-1}\rangle$. If $j \neq 0$, then we
must have $q \equiv 1 \pmod 3$, so $u \in k$, and it is once again easy to see that exactly
one among $y_0, y_1, y_2$ is in $k\langle x^{-1}\rangle$, while the other two are in $k(a^{1/2})\langle x^{-1}\rangle$. In
any case, this yields signature $(1,1,1,1,1,1)$ if $a$ is a square in $k$ and $(1,1,1,2)$
otherwise.

Case $|A|^3 = |B|^2$: Then $m = 2j$ and $n = 3j$ for some $j \in \mathbb{N}$.

Assume first that $\deg(D)$ is even. Then $\delta_+^3$ and $\delta_-^3$ are Laurent series in $x^{-1}$
of degree $3j$ with coefficients in $k$. By Lemma 4.1, $\delta_+, \delta_- \in k\langle x^{-1}\rangle$, so the same
holds for $y_0, y_1, y_2$. Furthermore, at least two among these roots have degree $j$,
and since $|y_0 y_1 y_2| = |B| = q^{3j}$, they all have degree $j$.

Suppose first that $4a^3 \neq 27b^2$. Let $y \in \{y_0, y_1, y_2\}$ and set $s = \mathrm{sgn}(y)$.
Then $s^3 - as + b = 0$. We note that $3s^2 \neq a$, as otherwise $b = as - s^3 = 2s^3$,
implying $4a^3 = 27b^2$. For $i \in \mathbb{N}$, let $s_{j-i}$ be the coefficient of $x^{j-i}$ in $y$. By
considering the coefficient of $x^{3j-i}$ in the equation $y^3 - Ay + B = 0$, we see
that $s_{j-i} = (3s^2 - a)^{-1} g$ where $g$ is a linear combination of products involving
the coefficients of $A$ and $B$ as well as $s_{j-i+1}, s_{j-i+2}, \ldots, s_{j-1}, s$ ($i \in \mathbb{N}$). A
simple induction argument thus shows that $s_{j-i} \in k(s)$ for all $i \in \mathbb{N}$. Hence,
$y_i \in k(\mathrm{sgn}(y_i))\langle x^{-1}\rangle$ where $\mathrm{sgn}(y_i)$ is a root of the equation $t^3 - at + b = 0$
for $i = 0, 1, 2$. This equation has 0, 1, or 3 distinct roots in $k$, yielding respective
signatures $(1,3)$, $(1,1,1,2)$, and $(1,1,1,1,1,1)$.

Now suppose that $4a^3 = 27b^2$. Then $a = 3e^2$, $b = 2e^3$ where $e = 3b/2a \in k^*$,
and $\mathrm{sgn}(\delta_+^3) = \mathrm{sgn}(\delta_-^3) = -27e^3$. Let $s$ be a square root of $-3\,\mathrm{sgn}(D)$ in some
suitable extension of $k$. Then by Lemma 4.1, $\sqrt{-3D} \in k(s)\langle x^{-1}\rangle$, so $\delta_+^3, \delta_-^3 \in
k(s)\langle x^{-1}\rangle$. Again by Lemma 4.1, $\delta_+, \delta_- \in k(s)\langle x^{-1}\rangle$. Write $\delta_+ = \beta + \gamma s$ with
$\beta, \gamma \in k\langle x^{-1}\rangle$. Then we reason completely analogously to the case $3m > 2n$, $m$
even, that $K/k(x)$ has signature $(1,1,1,1,1,1)$ if $\mathrm{sgn}(D)$ is a square in $k$ and
$(1,1,1,2)$ otherwise.

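Assume now that $\deg(D)$ is odd. Then $\sqrt{-3D} \in k\langle x^{-1/2}\rangle \setminus k\langle x^{-1}\rangle$, so
$\delta_\pm^3 \in k\langle x^{-1/2}\rangle \setminus k\langle x^{-1}\rangle$, and hence $\delta_+, \delta_- \notin k\langle x^{-1}\rangle$. It follows that at least
one of the roots does not lie in $k\langle x^{-1}\rangle$, so the signature is $(1,1,2,1)$ or $(3,1)$. But
for signature $(3,1)$, we have $y_i \in k\langle x^{-1/3}\rangle$ for $i = 0, 1, 2$, so $\delta_+ \in k(y_0, y_1, y_2, u) =
k(u)\langle x^{-1/3}\rangle$, and hence $\delta_+^3 \in k(u)\langle x^{-1/3}\rangle \cap k\langle x^{-1/2}\rangle = k\langle x^{-1}\rangle$, which is a
contradiction. So $K/k(x)$ must have signature $(1,1,2,1)$ in this case. □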

We point out that Lee [7] provided an elegant proof of the above theorem 
that uses the Hilbert class field of k{x,'/A), but it is restricted to square- 
free A. One can also apply the transformation x — >■ x~^ and investigate the 
polynomial cc®*F"(Y, (mod x), but this will be inconclusive in certain cases 
(when (0, 0) is a singular point of the resulting curve, i.e. deg(A) is odd and 
deg(H) = 1 (mod 3)). 

Using the signature description for purely cubic function fields given in The- 
orem 2.1 of [12] and the well-known characterization of hyperelliptic function 
fields (see for example Proposition 14.6 on p. 248 of [9]), we can reformulate and 
summarize Theorem 4.2 as follows. 
Corollary 4.3. Let $C : f(x,y) = 0$ with $f(Y) = Y^3 - AY + B \in k[x][Y]$ be
a standard model of a cubic extension $K/k(x)$ where $k = \mathbb{F}_q$ is a finite field of
characteristic at least 5. Set $D = 4A^3 - 27B^2$. Then the following holds:

• If $|D| \neq |B|^2$ (this is exactly the case if either $|A|^3 > |B|^2$, or $|A|^3 = |B|^2$
  and $4\,\mathrm{sgn}(A)^3 = 27\,\mathrm{sgn}(B)^2$), then the signature of $K/k(x)$ is $(1, 1, S)$ where
  $S$ is the signature of the hyperelliptic extension $k(x)(\sqrt{D})/k(x)$.

• If $|D| = |B|^2$, then there are two cases:

    ◦ If $|A|^3 < |B|^2$, then the signature of $K/k(x)$ is equal to the signature of
      the purely cubic extension $k(x)(\sqrt[3]{D})/k(x)$.
    ◦ If $|A|^3 = |B|^2$ and $4\,\mathrm{sgn}(A)^3 \neq 27\,\mathrm{sgn}(B)^2$, then $K/k(x)$ is unramified
      (i.e. all the $e_i$ in the signature of $K/k(x)$ are equal to 1), and the $f_i$ in
      the signature are the degrees (with respect to the indeterminate $t$) of the
      irreducible factors of the equation $t^3 - \mathrm{sgn}(A)\,t + \mathrm{sgn}(B) = 0$ over $k$.
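
The case distinction of Theorem 4.2 and Corollary 4.3 can be turned into a small decision procedure. The following is a minimal sketch, assuming for simplicity that $k = \mathbb{F}_p$ is a prime field with $p \geq 5$ and that the caller supplies the degrees and leading coefficients of $A$, $B$ (and of $D$ when the leading terms of $4A^3$ and $27B^2$ cancel). The function names and the argument layout are illustrative choices, not taken from the paper, and the root count uses brute force, which is only sensible for small $p$.

```python
def is_square(c: int, p: int) -> bool:
    """Euler's criterion for a nonzero residue c modulo an odd prime p."""
    return pow(c % p, (p - 1) // 2, p) == 1


def is_cube(c: int, p: int) -> bool:
    """Every element of F_p is a cube unless p = 1 (mod 3)."""
    if p % 3 != 1:
        return True
    return pow(c % p, (p - 1) // 3, p) == 1


def count_roots(a: int, b: int, p: int) -> int:
    """Number of roots in F_p of t^3 - a*t + b (brute force over F_p)."""
    return sum(1 for t in range(p) if (t * t * t - a * t + b) % p == 0)


def signature(p, deg_a, deg_b, sgn_a, sgn_b, deg_d=None, sgn_d=None):
    """Signature at infinity of Y^3 - A*Y + B over F_p(x), p >= 5 prime (sketch)."""
    a, b = sgn_a % p, sgn_b % p
    unramified_split = (1, 1, 1, 1, 1, 1)
    if 3 * deg_a > 2 * deg_b:                     # |A|^3 > |B|^2
        if deg_a % 2 == 1:
            return (1, 1, 2, 1)
        return unramified_split if is_square(a, p) else (1, 1, 1, 2)
    if 3 * deg_a < 2 * deg_b:                     # |A|^3 < |B|^2
        if deg_b % 3 != 0:
            return (3, 1)
        if not is_cube(b, p):
            return (1, 3)
        return unramified_split if p % 3 == 1 else (1, 1, 1, 2)
    # |A|^3 = |B|^2
    if (4 * a**3 - 27 * b**2) % p != 0:           # no cancellation in D
        return {0: (1, 3), 1: (1, 1, 1, 2), 3: unramified_split}[count_roots(a, b, p)]
    if deg_d is None or sgn_d is None:
        raise ValueError("leading terms cancel: deg(D) and sgn(D) are required")
    if deg_d % 2 == 1:
        return (1, 1, 2, 1)
    return unramified_split if is_square(sgn_d, p) else (1, 1, 1, 2)
```

For instance, over $\mathbb{F}_7$ with $\deg(A) = 2$, $\deg(B) = 3$ and $\mathrm{sgn}(A) = \mathrm{sgn}(B) = 1$, the polynomial $t^3 - t + 1$ has exactly one root ($t = 2$), so the sketch reports $(1,1,1,2)$.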



5 Unit Group and Regulator 



Let $K = k(x, y)$ be a cubic function field of characteristic different from 3 in
standard form with minimal polynomial $f(Y) = Y^3 - AY + B \in k[x][Y]$. As
before, denote by $y = y_0$, $y' = y_1$, $y'' = y_2$ the roots of $f(Y)$ (given by (4.2) if $k$
has odd characteristic). For any $\theta = a + by + cy^2 \in K$, write $\theta^{(i)} = a + by_i + cy_i^2$
for the $i$-th conjugate of $\theta$ ($0 \leq i \leq 2$). The unit group of $K/k(x)$ is the group of
units $\mathcal{O}^*$ of the maximal order $\mathcal{O}$ of $K$. By Dirichlet's Unit Theorem, $\mathcal{O}^*$ is an
infinite Abelian group whose torsion part is $k^*$ and whose torsion-free part has
rank $s - 1$ where $s$ is the number of places at infinity in $K/k(x)$. The quantity
$r = s - 1$ is called the unit rank of $K/k(x)$. The following table outlines the
possible unit rank scenarios for cubic function fields.



Signature                       Unit Rank
(1,3) or (3,1)                  0
(1,1,1,2) or (1,1,2,1)          1
(1,1,1,1,1,1)                   2
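
The table can be read off mechanically from a signature $(e_1, f_1, \ldots, e_s, f_s)$: the unit rank is the number of places at infinity minus one. A minimal sketch follows; the function name and the tuple encoding of the signature are assumptions made for illustration.

```python
def unit_rank(signature: tuple) -> int:
    """Unit rank r = s - 1 for a signature (e_1, f_1, ..., e_s, f_s) of a cubic extension."""
    if len(signature) == 0 or len(signature) % 2 != 0:
        raise ValueError("signature must consist of (e_i, f_i) pairs")
    ramification = signature[0::2]   # e_1, ..., e_s
    residue = signature[1::2]        # f_1, ..., f_s
    # In a cubic extension the local degrees e_i * f_i must add up to 3.
    if sum(e * f for e, f in zip(ramification, residue)) != 3:
        raise ValueError("not a valid signature of a cubic extension")
    return len(ramification) - 1
```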



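A set of generators $\{\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_r\}$ of $\mathcal{O}^*/k^*$ is a system of fundamental units.
Let $\{\mathfrak{p}_1, \mathfrak{p}_2, \ldots, \mathfrak{p}_s\}$ be the set of divisors in $K$ lying above the place at infinity
in $k(x)$ as described in (4.1). For $1 \leq i \leq s$, let $f_i$ denote the residue degree of
$\mathfrak{p}_i$ and $\nu_i$ the additive valuation associated with $\mathfrak{p}_i$. Consider the $r \times s$ integer
matrix

$$M = \begin{pmatrix}
-f_1\nu_1(\varepsilon_1) & -f_2\nu_2(\varepsilon_1) & \cdots & -f_s\nu_s(\varepsilon_1) \\
-f_1\nu_1(\varepsilon_2) & -f_2\nu_2(\varepsilon_2) & \cdots & -f_s\nu_s(\varepsilon_2) \\
\vdots & \vdots & & \vdots \\
-f_1\nu_1(\varepsilon_r) & -f_2\nu_2(\varepsilon_r) & \cdots & -f_s\nu_s(\varepsilon_r)
\end{pmatrix}.$$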



Rosen [9, p. 245] defines the regulator $R^{(q)}$ to be the absolute value of the
determinant of any of the $r \times r$ minors obtained by deleting the $j$-th column from
$M$ ($1 \leq j \leq s$); it is easy to show that this definition is independent of the
minor and the set of fundamental units chosen. While this definition is consistent
with the definition of the regulator for an algebraic number field, Schmidt [13]
presents a slightly different definition.

Denote by $S$ the group generated by $\mathfrak{p}_1, \mathfrak{p}_2, \ldots, \mathfrak{p}_s$, and let $S_0 = \{\mathfrak{D} \in S \mid
\deg(\mathfrak{D}) = 0\}$ where $\deg(\mathfrak{D})$ is the degree of $\mathfrak{D}$. If $P$ denotes the group of principal
divisors, then $S_0 \cap P$ is generated by the divisors $(\varepsilon_i)$ of the fundamental units $\varepsilon_i$
for $1 \leq i \leq r$. According to Schmidt, the regulator of $K/k(x)$ is the group index
$R = [S_0 : P \cap S_0]$. By Lemma 4.13 of [9], the two regulators are related via the
identity

$$R^{(q)} = \frac{f_1 f_2 \cdots f_s}{\gcd(f_1, f_2, \ldots, f_s)}\, R. \qquad (5.1)$$



Furthermore, if $h$ is the class number, i.e. the order of the Jacobian, of $K/k$,
and $h'$ the ideal class number of $K/k(x)$, then by Theorem 25 of [13],

$$h = \frac{R}{\gcd(f_1, f_2, \ldots, f_s)}\, h' = \frac{R^{(q)}}{f_1 f_2 \cdots f_s}\, h'. \qquad (5.2)$$



Specifically, for cubic function fields: 



Theorem 5.1. Let $C : y^3 - Ay + B = 0$ be a standard model of a cubic extension
$K/k(x)$. Let $h$ be the order of the Jacobian of $K/k$, and let $h'$, $R^{(q)}$, and $R$ denote
the ideal class number, the regulator à la Rosen, and the regulator à la Schmidt,
of $K/k(x)$, respectively. Let $\{\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_r\}$ be a system of fundamental units of
$K/k(x)$. If $K/k(x)$ has signature

• $(1,1,1,1,1,1)$, then

  $$R = R^{(q)} = \left|\det\begin{pmatrix} \deg(\varepsilon_1^{(i)}) & \deg(\varepsilon_1^{(j)}) \\ \deg(\varepsilon_2^{(i)}) & \deg(\varepsilon_2^{(j)}) \end{pmatrix}\right|$$

  where $i, j \in \{0, 1, 2\}$ with $i \neq j$, and $h = Rh'$;

• $(1,1,1,2)$, then $R = R^{(q)}/2 = |\deg(\varepsilon_1)|/2$, and $h = Rh'$;

• $(1,3)$, then $R = R^{(q)}/3 = 1$, and $h = h'/3$;

• $(1,1,2,1)$, then $R = R^{(q)} = |\deg(\varepsilon_1)|$, and $h = Rh'$;

• $(3,1)$, then $R = R^{(q)} = 1$, and $h = h'$.

Proof. The relationships between $R^{(q)}$ and $R$ as well as $h$ and $h'$ follow from (5.1)
and (5.2), respectively. For the rest, we can reason as in the proof of Theorem
2.1 of [12]: in the cases where there is only one (inert or totally ramified) place
at infinity in $K$, $S_0$ is trivial and hence $R = 1$. For signature $(1,1,1,1,1,1)$,
the formula for $R$ follows from the fact that the map that permutes the three
roots of $f(Y)$ also permutes the three places at infinity. Finally, if there are
two places at infinity in $K$, of respective degrees $f_1 = 1$ and $f_2 = 1$ or 2, then
$R = |\nu_1(\varepsilon_1)|/f_2$. Since $f_1 = 1$, the completion of $K$ with respect to $\mathfrak{p}_1$ is equal
to $k\langle x^{-1}\rangle$, so $|\nu_1(\varepsilon_1)| = |\deg(\varepsilon_1)|$. □

If $K/k(x)$ has signature $(3,1)$, i.e. the place at infinity in $k(x)$ is totally ramified
in $K$, then the Jacobian of $K/k$ is in fact isomorphic to the ideal class group
of $K/k(x)$.
6 Fundamental Units

In order to compute the regulator and/or fundamental unit(s) of a cubic function
field of nonzero unit rank, we require the notion of ideal reduction. Once again,
we let $K = k(x,y)$ be a cubic function field of characteristic at least 5 in standard
form with minimal polynomial $f(Y) = Y^3 - AY + B \in k[x, Y]$ and unit rank
$r > 0$. Possible signatures for $K/k(x)$ are $(1,1,1,1,1,1)$ if $r = 2$ and $(1,1,1,2)$ or
$(1,1,2,1)$ if $r = 1$. Let the roots of $f$ be $y_0 = y$, $y_1 = y'$, $y_2 = y''$. We henceforth
write the embedding(s) of K into k{x~^) multiplicatively. If r = 2, then there 
are three such embeddings given by three valuations | • |i (0 < i < 2). We write 
I • lo = I • I and number the valuations so that \y\i = \yi\ = so |g)|- = 

for all 0 G AT and 0<t<2. Ifr=l, then there is just one embedding of K into 
k{x~^) which we write as | • |. To unify the notation for both unit rank scenarios, 
we set |0|o = |6^|, \0\i = \9\2 = for all 0 € K in the unit rank 1 case. 

The regulator and fundamental unit(s) of K/k{x) can be computed exactly 
as described in [12] for the unit rank 1 case and [6] for the case of unit rank 2, 
so we only give the minimal necessary background here and recall the algorithm 
for completeness. We focus our discussion on fractional ideals of $\mathcal{O}$, i.e. subsets
$\mathfrak{f}$ of $K$ such that $d\mathfrak{f}$ is an ideal in $\mathcal{O}$ for some nonzero $d \in k[x]$. In our context,
fractional ideals are always nonzero (so they are $k[x]$-submodules of $K$ of rank
3) and contain 1. A fractional ideal $\mathfrak{f}$ is reduced if for any $\theta \in \mathfrak{f}$, the inequalities
$|\theta|_i \leq 1$ for $i = 0, 1, 2$ imply $\theta \in k$. Note that $\mathcal{O}$ is reduced.

Let {l,p, w} be a canonical basis of Kjk{x). For a = a + bp + auGK with 
a,b,cG k{t), we let^ 

(a = ot' + a" = Tr{a) — a = (2a — 3bT + cE) — bp — cw, 

= a-^Tr(a) = ^(2a-Ca) = (bT - ^Ec) + bp + cuj, (6.1) 

Pa = a' - a" = (y' - y") (b - jp^ , 

where E, T, and I (the index of y) are as in Corollary 3.2. Note that Co,, ^a, 

Pa/(y' - y") G at. 

Let {l,p,,v} be a fc[x]-basis of some non-zero reduced fractional ideal f of O. 
Then it is easy to verify that 

det ^ = s Li(f) (6.2) 

for some s G k* . For z G {0, 1, 2}, the basis {1, p, ir} is said to be i-reduced if 

lUI* > K\i < 1 < KU, IC/xli < 1. ICI* < 1- (6-3) 

^1 The definitions of $\zeta_\alpha$, $\xi_\alpha$, $\eta_\alpha$ can be modified in such a way that the reduction
algorithm given below also works in fields of even characteristic, other than $\mathbb{F}_2$. Since we
excluded the characteristic 2 case up to now, we omit the details here.
Such a basis always exists and is unique up to nonzero constant factors (Theorem
4.4 in [ 6 ]). Since |Ca«I* < we have < 1, so |/x| > 1 since f is 

reduced and ^ ^ k. li follows that |^^|i = |^|i > 1 and hence from ( 6 . 2 ) and 
(6.3), |Z\(f)| > 1 and \v\i < |/x|i < |Z\(f)|. Furthermore, = \v''\i > 1. 

The following theorem gives the connection between reduced bases and fun- 
damental units. For purely cubic extensions, the relevant discussion of the unit 
rank 1 case can be found on p. 1255 of [12]; see also Theorem 3.7 in [6] for unit
rank 2. The result was proved for arbitrary number fields of unit rank 1 and 2 
in [3], and the proofs in that source carry over completely to the function field 
setting. 

Theorem 6.1. 

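1. Suppose $r = 1$ and set $\mathfrak{f}_0 = \mathcal{O}$, $\mathfrak{f}_{n+1} = (\mu_n)^{-1}\mathfrak{f}_n$ for $n \geq 0$ where $\{1, \mu_n, \nu_n\}$ is
a 0-reduced basis of the reduced fractional ideal $\mathfrak{f}_n$. Let $l \in \mathbb{N}$ be the minimal
index such that $\mathfrak{f}_l = \mathfrak{f}_0$. Then

$$\varepsilon = \prod_{i=0}^{l-1} \mu_i$$

is a fundamental unit of $K/k(x)$.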

2. Suppose r = 2 and set fg = O, fn+i = (c«n ^)fn for n > 0 where 

_ / Mn if \^n\l > 1, 

- Sgn(l/(,) if \Vr,\l = 1, 

and is a 0-reduced basis of the reduced fractional ideal f„. Let 

p G Z-° and I G N be minimal such that fp_|_; = fp and set 



p+i-i 

ei == Oii. 

i—p 

Now set flg = fp, = {(3~^)Qn forn>l where 

fj ^ f if |r„|o > 1, 

\ r„ - sgn(r„) if |r„|g = 1. 

and {l,cr„,r„} is a 2-reduced basis of the reduced fractional ideal g„. Let 
m,h G Z-° be minimal such that = fp^/j and set 

m — 1 j p+h—1 

£2 = n n 

1=0 \ i=p 

Then {£ 1 , 62 } is a pair of fundamental units of K/k{x). 
To find the regulator instead of the fundamental unit(s), we can avoid evaluating
the (computationally expensive) products given above. Instead, we simply
sum over the degrees of the $\mu_i$ (for $r = 1$), respectively, the $\alpha_i$, $\alpha_i'$, $\beta_j$, and $\beta_j'$
(for $r = 2$). Note that ideal equality as required in the above theorem can be
tested by comparing appropriately normalized 0-reduced bases.

Theorem 6.1 implies that in order to determine the fundamental unit(s) of
$K/k(x)$, we require a way to compute for $i \in \{0, 1, 2\}$ an $i$-reduced basis of a
reduced fractional ideal $\mathfrak{f}$, where $\mathfrak{f}$ is given in terms of a $k[x]$-basis of the form
$\{1, \tilde\mu, \tilde\nu\}$ with



= {P,^} or 

{P, i'} = or (6.4) 

l/i, v} = , ^9~^} where 9 = v — sgn(i/^*+^)), 



where the last case only occurs for unit rank 2 and $i = 0$ or $i = 2$ (in the latter
case, $i + 1$ is taken to be 0). Then the desired reduced bases can be computed using
the following algorithm (see also Algorithm 7.1 of [12] with the simplification of
Algorithm 6.3 in [10] for $r = 1$, and Algorithm 4.6 of [6] for $r = 2$):

Algorithm 6.2. 

Input: $(i, \tilde\mu, \tilde\nu)$ where $i \in \{0, 1, 2\}$ and $\tilde\mu, \tilde\nu$ are given by (6.4).
Output: $(\mu, \nu)$ where $\{1, \mu, \nu\}$ is an $i$-reduced basis of $\mathfrak{f}$.

Algorithm:

1. Set $\mu = \tilde\mu$, $\nu = \tilde\nu$.



2- If ^ or if rind [zy^lz ^ 



replace 



by 



0 1 
-10) \iy 



3- If \Vfj.\i > \v,,U then 
3.1. (r = 2 only.) While \ivriv\i > |2\(f)|^/^, 



replace 
3.2. Replace 



by 

by 



0 



1 



-1 LCu(*)/C(*)J j \'^ j ' 



0 



1 



p 



3-3. If = \vi,\i, 



-1 LCai(c/L(oJ J J ' 

1 —a' 



replace by ”)(^) ^ ^ 



4 . While |? 7 ^|* > 1, replace by ^1^ ^P^ _ 

5. Replace p. by pi — [C^(t)J/2 and v by v — [^j^(i)J/2. 

6. Return (pi, v) . 

Here, for $\alpha = \sum_{i=-\infty}^{m} a_i x^i \in k\langle x^{-1}\rangle$, the expression $\lfloor\alpha\rfloor = \sum_{i=0}^{m} a_i x^i$ denotes the
polynomial part of $\alpha$.
7 Approximations



In order to compute expressions such as or even just = |? 7 ^(i)| 

in Algorithm 6.2, it is necessary to have a sufficiently good approximation for
the basis elements $\rho, \omega$ (and their conjugates in the unit rank 2 case). These
elements lie in $k\langle x^{-1}\rangle$ and thus have an infinite expansion in $x^{-1}$, of which
we can only carry finitely many terms. We in turn require sufficiently good
approximations for the root(s) $y_0$ (and in the unit rank 2 case, $y_1$ and $y_2$ as well)
of $f(Y) = Y^3 - AY + B$. In purely cubic fields, this can be accomplished by
explicitly extracting a cube root of $-B$ to a sufficient precision. However, if the
extension is not purely cubic, i.e. $A \neq 0$, we need to proceed differently; in fact,
we essentially use Newton's method.
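
To make the reference to Newton's method concrete, the following worked computation spells out one exact Newton step for $f(Y) = Y^3 - AY + B$ and the resulting error identity. It is a sketch under the assumption that $\alpha$ is a root of $f$ and $\alpha_j$ is the current approximation; any truncation of the series to finitely many terms is ignored here.

```latex
\begin{align*}
N(\alpha_j) &= \alpha_j - \frac{f(\alpha_j)}{f'(\alpha_j)}
             = \alpha_j - \frac{\alpha_j^3 - A\alpha_j + B}{3\alpha_j^2 - A}
             = \frac{2\alpha_j^3 - B}{3\alpha_j^2 - A}, \\[4pt]
\alpha - N(\alpha_j)
            &= \frac{3\alpha\alpha_j^2 - A\alpha - 2\alpha_j^3 + B}{3\alpha_j^2 - A}
             = \frac{3\alpha\alpha_j^2 - 2\alpha_j^3 - \alpha^3}{3\alpha_j^2 - A}
             \qquad (\text{using } B = A\alpha - \alpha^3) \\[4pt]
            &= -\,\frac{(\alpha - \alpha_j)^2\,(\alpha + 2\alpha_j)}{3\alpha_j^2 - A}.
\end{align*}
```

The factor $(\alpha - \alpha_j)^2$ is what roughly doubles the number of correct terms per step, provided the denominator $3\alpha_j^2 - A$ is not small relative to $|\alpha|^2$.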

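Analogous to [10] and [6], we define for a nonzero element $\alpha \in k\langle x^{-1}\rangle$ of
degree $m$ a relative approximation of precision $n \in \mathbb{Z}^{\geq 0}$ to $\alpha$ to be a truncated
Laurent series $\hat\alpha$ with $|1 - \hat\alpha/\alpha| < q^{-n}$. If $\alpha = \sum_{i=-\infty}^{m} a_i x^i$, then we can set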
^ purely cubic fields, the analysis in [10] revealed that 

precision n = deg(A)/2 for relative approximations to the basis elements p and 
w was sufficient to guarantee that the reduction algorithm (with p and ui replaced 
by their respective approximations) produces correct results. We suspect that 
the same is true for arbitrary cubic fields, but a more careful investigation of 
this question is warranted and is the subject of future research. 

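Theorem 7.1. Let $\alpha \in k\langle x^{-1}\rangle$ be any root of $f(Y) = Y^3 - AY + B$ ($A, B \in k[x]$,
$AB \neq 0$). Set $l = \max\{0, -\deg(A\alpha^{-2} - 3)\} \in \mathbb{Z}^{\geq 0}$, and let $\alpha_0$ be a relative
approximation of precision $l$ to $\alpha$. For $j \in \mathbb{N}$, define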



Then 



1 - 



(2a). 


1 


- A J 


a 




aj 





with Vj = max{0, 2^ — 1 + I — deg(a)} G Z- . 



Proof. The claim holds for j = 0. For j > 0, we have 

\a — Oj+i 1 = 



2a) -B 



2a) -B 



3a) -A 



3a) — A 



(2a) - B)x^^+^ 
3a) — A 



The expression in the second set of parentheses has absolute value at most 

For the term in the first set of parentheses, write 



a — 



2a) - B 
3a) — A 



— (a - aj) + aj - 



2ai - B 
^ ~ 3a) - A 



= a — aj 



a) — Aaj + B 



3a) — A 



— OL — CXj -\- 



— A.OI -\- B^ 



3a) - A 



= (a-aj) 1 - 



a) + aaj + a"^ — A 



3a) - A 



= -(a-aj) 



Oi “h ^OLj 
Now $|3\alpha_j^2 - A| = |3(\alpha_j^2 - \alpha^2) + 3\alpha^2 - A|$. By induction hypothesis and assumption,
\aj — a^\ < q~’’\a\^ < |3a^ — A| , so since \a + 2aj\ = |a|, again by induction 
hypothesis, 



2«3 _ B 
3a] - A 



< {q 



-( 2 ^+ 0 ) 2 . |2 



|3a2 _ A\ 






□ 



Lemma 7.2. Let $f(Y) = Y^3 - AY + B$ ($A, B \in k[x]$, $B \neq 0$) and let $y_0, y_1, y_2$
be the zeros of $f(Y)$ with $y_0 \in k\langle x^{-1}\rangle$ ($y_1, y_2 \in k\langle x^{-1}\rangle$ if $K$ has unit rank 2).

• If $|A|^3 > |B|^2$, then $|y_0| = |B|/|A|$, $\mathrm{sgn}(y_0) = \mathrm{sgn}(B)/\mathrm{sgn}(A)$, $|y_1| = |y_2| =
  |A|^{1/2}$, $\mathrm{sgn}(y_1) = -\mathrm{sgn}(y_2) = \mathrm{sgn}(A)^{1/2}$.

• If $|A|^3 < |B|^2$, then $|y_i| = |B|^{1/3}$ and $\mathrm{sgn}(y_i) = -u^{i}\,\mathrm{sgn}(B)^{1/3}$ for $i = 0, 1, 2$,
  where $u$ is a primitive cube root of unity.

• If $|A|^3 = |B|^2$, then $|y_i| = |A|^{1/2}$, and the values of $\mathrm{sgn}(y_i)$ are the roots
  of the equation $t^3 - \mathrm{sgn}(A)\,t + \mathrm{sgn}(B) = 0$ for $i = 0, 1, 2$. If $4\,\mathrm{sgn}(A)^3 \neq
  27\,\mathrm{sgn}(B)^2$, then these roots are distinct; otherwise, the roots are $-2c, c, c$
  where $c = 3\,\mathrm{sgn}(B)/2\,\mathrm{sgn}(A)$, so $\mathrm{sgn}(A) = 3c^2$ and $\mathrm{sgn}(B) = 2c^3$.

Lemma 7.2 shows that the quantity $l$ in Theorem 7.1 is almost always zero,
in which case we can determine $\alpha_0 = \mathrm{sgn}(\alpha)\,x^{\deg(\alpha)}$ from the lemma. In order
to obtain a desired precision $n$ for our root approximation, we then simply
compute $\alpha_0, \alpha_1, \ldots, \alpha_m$ where $m = \lceil \log_2(n + 1) \rceil$. The only problematic case which
requires a better initial approximation $\alpha_0$ to $\alpha$ happens when $|A|^3 = |B|^2$ and
$4\,\mathrm{sgn}(A)^3 = 27\,\mathrm{sgn}(B)^2$. The smaller $|A\alpha^{-2} - 3|$ is, the closer our situation
resembles a repeated root scenario (as expected), with two roots $y_1, y_2$ of $f(Y)$
lying close together (and close to one of the square roots of $A/3$ as well as one
of the cube roots of $B/2$). Then $4A^3 \approx 27B^2$, i.e. $|D|$ is small as well (note that
$|D| = |A|^2 |y_1 - y_2|^2$ in this case).

Note that in order to determine in step 4 of Algorithm 6.2, we need to 
compute \y' — y"\i by (6.1). If \y' — y''\i > \y'\i, this can be done using Lemma 
7.2; otherwise, we have \y' — y''\i = \A\\{y — y'){y — y'')\~'^ , and the denominator 
can again be computed using Lemma 7.2. 

We will present an implementation of these ideas, together with numerical
results, in a future paper.
Abstract. The binary algorithm is a variant of the Euclidean algorithm
that performs well in practice. We present a quasi-linear time recursive
algorithm that computes the greatest common divisor of two integers
by simulating a slightly modified version of the binary algorithm. The
structure of our algorithm is very close to the one of the well-known
Knuth-Schönhage fast gcd algorithm; although it does not improve on
its $O(M(n) \log n)$ complexity, the description and the proof of correctness
are significantly simpler in our case. This leads to a simplification of the
implementation and to better running times.



1 Introduction 

Gcd computation is a central task in computer algebra, in particular when com- 
puting over rational numbers or over modular integers. The well-known Eu- 
clidean algorithm solves this problem in time quadratic in the size n of the inputs. 
This algorithm has been extensively studied and analyzed over the past decades. 
We refer to the very complete average complexity analysis of Vallee for a large 
family of gcd algorithms, see [10]. The first quasi-linear algorithm for the integer 
gcd was proposed by Knuth in 1970, see [4]: he showed how to calculate the gcd of
two $n$-bit integers in time $O(n \log^5 n \log\log n)$. The complexity of this algorithm
was improved by Schönhage [6] to $O(n \log^2 n \log\log n)$. A comprehensive description
of the Knuth-Schönhage algorithm can be found in [12]. The correctness of
this algorithm is quite hard to establish, essentially because of the technical de- 
tails around the so-called “fix-up procedure” , and a formal proof is by far out of 
reach. As an example, several mistakes can be noticed in the proof of [12] and can 
be found at http://www.cs.nyu.edu/cs/faculty/yap/book/errata.html. 
This “fix-up procedure” is a tedious case analysis and is quite difficult to imple- 
ment. This usually makes the implementations of this algorithm uninteresting 
except for very large numbers (of length significantly higher than 10® bits). 

In this paper, we present a variant of the Knuth-Schonhage algorithm that 
does not have the “fix-up procedure” drawback. To achieve this, we introduce 
a new division (GB for Generalized Binary), which can be seen as a natural 
generalization of the binary division and which has some natural meaning over 
the 2-adic integers. It does not seem possible to use the binary division itself in 
a Knuth-Schonhage-like algorithm, because its definition is asymmetric: it elim- 
inates least significant bits but it also considers most significant bits to perform 

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 411-425, 2004. 
comparisons. There is no such asymmetry in the GB division. The recursive GB
Euclidean algorithm is much simpler to describe and to prove than the Knuth-Schönhage
algorithm, while admitting the same asymptotic complexity.

This simplification of the description turns out to be an important advantage 
in practice: we implemented the algorithm in GNU MP, and it ran between 
three and four times faster than the implementations of the Knuth-Schonhage 
algorithm in Magma and Mathematica. 

The rest of the paper is organized as follows. In §2 we introduce the GB di- 
vision and give some of its basic properties. In §3, we describe precisely the new 
recursive algorithm. We prove its correctness in §4 and analyze its complexity 
in §5. Some implementation issues are discussed in §6. 

Notations: The standard notation $O(\cdot)$ is used. The complexity is measured in
elementary operations on bits. Unless specified explicitly, all the logarithms are
taken in base 2. If $a$ is a non-zero integer, $\ell(a)$ denotes the length of the binary
representation of $a$, i.e. $\ell(a) = \lfloor \log |a| \rfloor + 1$; $\nu_2(a)$ denotes the 2-adic valuation
of $a$, i.e. the number of consecutive zeroes in the least significant bits of the
binary representation of $a$; by definition, $\nu_2(0) = \infty$. $r := a \ \mathrm{cmod}\ b$ denotes the
centered remainder of $a$ modulo $b$, i.e. $a \equiv r \bmod b$ and $-\frac{b}{2} < r \leq \frac{b}{2}$. We recall
that $M(n) = \Theta(n \log n \log\log n)$ is the asymptotic time required to multiply two
$n$-bit integers with Schönhage-Strassen multiplication [7]. We assume the reader
is familiar with basic arithmetic operations such as fast multiplication and fast
division based on Newton's iteration. We refer to [11] for a complete description
of these algorithms.
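
The three notations above translate directly into code. The following is a minimal sketch; the function names `bit_length`, `nu2` and `cmod` are illustrative, and the boundary convention for the centered remainder (the value $b/2$ is returned rather than $-b/2$) is an assumption.

```python
import math


def bit_length(a: int) -> int:
    """l(a) = floor(log2 |a|) + 1, the length of the binary representation of a != 0."""
    if a == 0:
        raise ValueError("l(a) is only defined for non-zero a")
    return abs(a).bit_length()


def nu2(a: int):
    """2-adic valuation of a; by convention nu2(0) is infinity."""
    if a == 0:
        return math.inf
    # a & -a isolates the lowest set bit, also for negative a.
    return (a & -a).bit_length() - 1


def cmod(a: int, b: int) -> int:
    """Centered remainder r of a modulo b > 0, with a = r (mod b) and -b/2 < r <= b/2."""
    if b <= 0:
        raise ValueError("modulus must be positive")
    r = a % b
    if 2 * r > b:
        r -= b
    return r
```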

2 The Generalized Binary Division 

In this section we first recall the binary algorithm. Then we define the generalized 
binary division — GB division for short — and give some basic properties about 
it. Subsection 2.4 explains how to compute modular inverses from the output of 
the Euclidean algorithm based on the GB division. 

2.1 The Binary Euclidean Algorithm 

The binary division is based on the following properties: $\gcd(2a, 2b) = 2\gcd(a, b)$,
$\gcd(2a + 1, 2b) = \gcd(2a + 1, b)$, and $\gcd(2a + 1, 2b + 1) = \gcd(2b + 1, a - b)$. It
consists of eliminating the least significant bit at each loop iteration. Fig. 1 is a
description of the binary algorithm. The behavior of this algorithm is very well
understood (see [1] and the references there). Although it is still quadratic in the
size of the inputs, there is a significant gain over the usual Euclidean algorithm,
in particular because there is no need to compute any quotient.
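
The three gcd identities above are enough to write the binary algorithm as a loop. The sketch below is an iterative Python rendering of that idea; it is an illustration rather than the authors' implementation, and the function name `binary_gcd` is an assumption.

```python
def binary_gcd(a: int, b: int) -> int:
    """Binary Euclidean algorithm: only comparisons, subtractions and shifts."""
    a, b = abs(a), abs(b)
    shift = 0                      # number of common factors of two removed so far
    while b != 0:
        if b > a:
            a, b = b, a            # keep |a| >= |b|
        elif a % 2 == 0 and b % 2 == 0:
            a, b, shift = a // 2, b // 2, shift + 1   # gcd(2a, 2b) = 2 gcd(a, b)
        elif a % 2 == 0:
            a //= 2                # gcd(2a, b) = gcd(a, b) for odd b
        elif b % 2 == 0:
            b //= 2                # gcd(a, 2b) = gcd(a, b) for odd a
        else:
            a = (a - b) // 2       # both odd: a - b is even and a >= b
    return a << shift
```

No quotient is ever computed; each iteration either swaps the operands or removes at least one bit from one of them.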

2.2 The Generalized Binary Division 

In the case of the standard Euclidean division of $a$ by $b$ with $|a| > |b|$, one
computes a quotient $q$ such that when $qb$ is added to $a$, the obtained remainder




A Binary Recursive Gcd Algorithm 413 



Algorithm Binary-Gcd.

Input: $a, b \in \mathbb{Z}$.

Output: $\gcd(a, b)$.

1. If $|b| > |a|$, return Binary-Gcd($b$, $a$).

2. If $b = 0$, return $a$.

3. If $a$ and $b$ are both even then return 2 · Binary-Gcd($a/2$, $b/2$).

4. If $a$ is even and $b$ is odd then return Binary-Gcd($a/2$, $b$).

5. If $a$ is odd and $b$ is even then return Binary-Gcd($a$, $b/2$).

6. Otherwise return Binary-Gcd($(|a| - |b|)/2$, $b$).



Fig. 1. The binary Euclidean algorithm. 



is smaller than $b$. Roughly speaking, left shifts of $b$ are subtracted from $a$ as
long as possible, that is to say until $a$ has lost its $\ell(a) - \ell(b)$ most significant
bits (approximately). The GB division is the dual: in order to GB-divide $a$ by $b$
with $|a|_2 > |b|_2$, where $|a|_2 := 2^{-\nu_2(a)}$ is the 2-adic norm of $a$, one computes a
quotient $\frac{q}{2^k}$ such that when $\frac{q}{2^k} b$ is added to $a$, the obtained remainder is smaller
than $b$ for the 2-adic norm. Roughly speaking, right shifts of $b$ are subtracted
from $a$ as long as possible, that is to say until $a$ has lost its $\nu_2(b) - \nu_2(a)$ least
significant bits.



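Lemma 1 (GB Division). Let $a, b$ be non-zero integers with $\nu_2(a) < \nu_2(b)$.
Then there exists a unique pair of integers $(q, r)$ such that:

$$r = a + q\,\frac{b}{2^{\nu_2(b) - \nu_2(a)}}, \qquad (1)$$

$$|q| < 2^{\nu_2(b) - \nu_2(a)}, \qquad (2)$$

$$\nu_2(r) > \nu_2(b). \qquad (3)$$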

The integers q and r are called respectively the GB quotient and the GB remain- 
der of {a, b). We define GB{a,b) as the pair {q,r). 

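Proof. From (1) and (3), $q \equiv -\frac{a}{2^{\nu_2(a)}} \left(\frac{b}{2^{\nu_2(b)}}\right)^{-1} \bmod 2^{\nu_2(b) - \nu_2(a) + 1}$. Since $q$ is odd, its
centered representative modulo $2^{\nu_2(b) - \nu_2(a) + 1}$ fulfills the second condition, which
in turn gives the uniqueness of $q$. As a consequence,
$r$ is uniquely defined by (1). Moreover, since $r = a + q\,\frac{b}{2^{\nu_2(b) - \nu_2(a)}}$, we have
$r \equiv 0 \bmod 2^{\nu_2(b) + 1}$, which gives condition (3).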
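
The proof suggests a direct way to compute the GB quotient: invert the odd part of $b$ modulo $2^{\nu_2(b) - \nu_2(a) + 1}$ and take a centered representative. The following is a minimal sketch of that computation, assuming Python 3.8 or later for the modular inverse via `pow(x, -1, m)`; the names `nu2` and `gb_divide` are illustrative.

```python
def nu2(a: int) -> int:
    """2-adic valuation of a non-zero integer."""
    if a == 0:
        raise ValueError("valuation of 0 is infinite")
    return (a & -a).bit_length() - 1


def gb_divide(a: int, b: int):
    """Return (q, r) with r = a + q*b/2^k, |q| < 2^k, nu2(r) > nu2(b), k = nu2(b) - nu2(a)."""
    if a == 0 or b == 0:
        raise ValueError("a and b must be non-zero")
    k = nu2(b) - nu2(a)
    if k <= 0:
        raise ValueError("need nu2(a) < nu2(b)")
    a_odd = a >> nu2(a)            # exact: removes the factors of two
    b_odd = b >> nu2(b)
    m = 1 << (k + 1)
    q = (-a_odd * pow(b_odd % m, -1, m)) % m
    if 2 * q > m:                  # centered representative; q is odd, so q != m/2
        q -= m
    r = a + (q * b) // (1 << k)    # exact division, since 2^k divides b
    return q, r
```

For example, `gb_divide(5, 6)` returns `(1, 8)`: here $k = 1$, and $5 + 1 \cdot 6/2 = 8$ has 2-adic valuation 3, which exceeds $\nu_2(6) = 1$.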

The GB division resembles Hensel's odd division, which was introduced by
Hensel around 1900. A description can be found in [8]. For two integers $a$ and $b$
with $b$ odd, it computes $q$ and $r$ in $\mathbb{Z}$, such that: $a = -bq + 2^p r$ and $r < 2b$,
where $p = \ell(a) - \ell(b)$ is the difference between the bit lengths of $a$ and $b$. In
other words, Hensel's odd division computes $r := (2^{-p} a) \bmod b$, which may be
found efficiently as shown in [5]. Besides, there is also some similarity with the
PM algorithm of Brent and Kung [2]. When $\nu_2(a) = 0$ and $\nu_2(b) = 1$, i.e. $a$ is
odd and $b = 2b'$ with $b'$ odd, the GB division finds $q = \pm 1$ such that $(a + qb')/2$
is even, which is exactly the PM algorithm. Unlike the binary, Hensel and PM
algorithms, the GB division considers only low order bits of $a$ and $b$: there is
no need to compare $a$ and $b$ or to compute their bit lengths. It can be seen as a
natural operation when considering $a$ and $b$ as 2-adic integers.

We now give two algorithms to perform the GB division. The first one is the
equivalent of the naive division algorithm, and is asymptotically slower than the
second one, which is the equivalent of the fast division algorithm. For most of the
input pairs $(a, b)$, the Euclidean algorithm based on the GB division performs
almost all its divisions on pairs $(c, d)$ for which $\nu_2(d) - \nu_2(c)$ is small. For this
reason the first algorithm suffices in practice.



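Algorithm Elementary-GB.

Input: Two integers $a, b$ satisfying $\nu_2(a) < \nu_2(b) < \infty$.
Output: $(q, r) = \mathrm{GB}(a, b)$.

1. $q := 0$, $r := a$.

2. While $\nu_2(r) \leq \nu_2(b)$ do

3.     $q := q - 2^{\nu_2(r) - \nu_2(a)}$,

4.     $r := r - 2^{\nu_2(r) - \nu_2(b)}\, b$.

5. $q := q \ \mathrm{cmod}\ 2^{\nu_2(b) - \nu_2(a) + 1}$, $r := q\,\frac{b}{2^{\nu_2(b) - \nu_2(a)}} + a$.

6. Return $(q, r)$.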



Fig. 2. Algorithm Elementary-GB. 



Lemma 2. The algorithm Elementary-GB of Fig. 2 is correct and if the input
$(a, b)$ satisfies $\ell(a), \ell(b) \leq n$, then it finishes in time $O(n \cdot [\nu_2(b) - \nu_2(a)])$.
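
The elementary algorithm clears one low-order bit of the running remainder per loop iteration. The following Python sketch mirrors that loop; it is an illustration under the assumption that the loop continues while $\nu_2(r) \leq \nu_2(b)$ and that the final step recenters $q$ and recomputes $r$ from (1). The names `nu2` and `elementary_gb` are illustrative.

```python
def nu2(a: int) -> int:
    """2-adic valuation of a non-zero integer."""
    if a == 0:
        raise ValueError("valuation of 0 is infinite")
    return (a & -a).bit_length() - 1


def elementary_gb(a: int, b: int):
    """GB division by repeated subtraction of right shifts of b (quadratic-time variant)."""
    va, vb = nu2(a), nu2(b)
    k = vb - va
    if k <= 0:
        raise ValueError("need nu2(a) < nu2(b)")
    q, r = 0, a
    # nu2(0) is infinite, so a zero remainder ends the loop.
    while r != 0 and nu2(r) <= vb:
        vr = nu2(r)
        q -= 1 << (vr - va)
        r -= b >> (vb - vr)        # b * 2^(vr - vb), exact because vr <= vb
    m = 1 << (k + 1)
    q %= m                         # centered remainder modulo 2^(k+1)
    if 2 * q > m:
        q -= m
    r = a + (q * b) // (1 << k)    # recompute r from relation (1)
    return q, r
```

Each iteration strictly increases $\nu_2(r)$, because the odd parts of $r$ and $b$ cancel in the lowest bit, so the loop runs at most $\nu_2(b) - \nu_2(a) + 1$ times.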

It is also possible to compute $\mathrm{GB}(a, b)$ in quasi-linear time, in the case of
Schönhage-Strassen multiplication, by using Hensel's lifting (which is the $p$-adic
dual of Newton's iteration).

Lemma 3. The algorithm Fast-GB of Fig. 3 is correct and with Schönhage-Strassen
multiplication, if $a$ and $b$ satisfy the conditions $\ell(a), \ell(b) \leq 2n$ and
$\nu_2(b) - \nu_2(a) \leq n$, then it finishes in time $O(M(n))$.

2.3 The GB Euclidean Algorithm 

A GB Euclidean algorithm can be derived very naturally from the definition of 
the GB division, see Fig. 4. 

Lemma 4. The GB Euclidean algorithm of Fig. 4 is correct, and if we use
the algorithm Elementary-GB of Fig. 2, then for any input $(a, b)$ satisfying
$\ell(a), \ell(b) \leq n$, it finishes in time $O(n^2)$.
Algorithm Fast-GB.

Input: Two integers $a, b$ satisfying $\nu_2(a) < \nu_2(b) < \infty$.
Output: $(q, r) = \mathrm{GB}(a, b)$.

1. $A := -\frac{a}{2^{\nu_2(a)}}$, $B := \frac{b}{2^{\nu_2(b)}}$, $n := \nu_2(b) - \nu_2(a) + 1$.

2. $q := 1$.

3. For $i$ from 1 to $\lceil \log n \rceil$ do

4.     $q := q + q(1 - Bq) \bmod 2^{2^i}$.

5. $q := Aq \ \mathrm{cmod}\ 2^n$.

6. $r := a + \frac{q}{2^{n-1}}\, b$.

7. Return $(q, r)$.



Fig. 3. Algorithm Fast-GB. 
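
The core of the fast variant is a Hensel (2-adic Newton) lifting of the inverse of the odd part of $b$, which doubles the number of correct low-order bits at each step. The following is a sketch of that idea in Python; it does not reproduce the quasi-linear running time, which depends on fast multiplication, and the names `nu2` and `fast_gb` are illustrative.

```python
def nu2(a: int) -> int:
    """2-adic valuation of a non-zero integer."""
    if a == 0:
        raise ValueError("valuation of 0 is infinite")
    return (a & -a).bit_length() - 1


def fast_gb(a: int, b: int):
    """GB division where the inverse of the odd part of b is obtained by Hensel lifting."""
    va, vb = nu2(a), nu2(b)
    n = vb - va + 1
    if n < 2:
        raise ValueError("need nu2(a) < nu2(b)")
    A = -(a >> va)                 # minus the odd part of a
    B = b >> vb                    # odd part of b
    q = 1                          # inverse of B modulo 2, since B is odd
    bits = 1
    while bits < n:                # ceil(log2 n) lifting steps
        bits *= 2
        # If B*q = 1 mod 2^t, then B*(q + q*(1 - B*q)) = 1 mod 2^(2t).
        q = (q + q * (1 - B * q)) % (1 << bits)
    m = 1 << n
    q = (A * q) % m                # q = A / B modulo 2^n ...
    if 2 * q > m:                  # ... as a centered remainder
        q -= m
    r = a + ((q * b) >> (n - 1))   # exact shift, since 2^(n-1) divides b
    return q, r
```

The lifting step works because $1 - Bq' = (1 - Bq)^2$ for $q' = q + q(1 - Bq)$, so the error term is squared at each step.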



Algorithm GB-gcd.

Input: Two integers $a, b$ satisfying $\nu_2(a) < \nu_2(b)$.

Output: The odd part $\frac{g}{2^{\nu_2(g)}}$ of the greatest common divisor $g$ of $a$ and $b$.

1. If $b = 0$, return $\frac{a}{2^{\nu_2(a)}}$.

2. $(q, r) := \mathrm{GB}(a, b)$.

3. Return GB-gcd($b$, $r$).



Fig. 4. The GB Euclidean algorithm. 



Proof. Let $r_0 = a$, $r_1 = b$, $r_2, \ldots$ be the sequence of remainders that appear in
the execution of the algorithm. We first show that this sequence is finite and
thus that the algorithm terminates.

For any $k \geq 0$, Eqs. (1) and (2) give $|r_{k+2}| \leq |r_{k+1}| + |r_k|$, so that $|r_k| \leq
2^n \left(\frac{1 + \sqrt{5}}{2}\right)^{k+1}$. Moreover, $2^k$ divides $|r_k|$, which gives $2^k \leq |r_k| \leq 2^n \left(\frac{1 + \sqrt{5}}{2}\right)^{k+1}$
and $\left(1 - \log \frac{1 + \sqrt{5}}{2}\right) k \leq n + 1$. Therefore there are $O(n)$ remainders in the
remainder sequence. Let $t = O(n)$ be the length of the remainder sequence. Suppose
that $r_t$ is the last non-zero remainder. From Lemma 2, we know that each
of the calls to a GB-division involves a number of bit operations bounded by
$O(\log |r_k| \cdot [\nu_2(r_{k+1}) - \nu_2(r_k)]) = O(n \cdot [\nu_2(r_{k+1}) - \nu_2(r_k)])$, so that the overall
complexity is bounded by $O(n \cdot \nu_2(r_t))$.

For the correctness, remark that the GB remainder $r$ from $a$ and $b$ satisfies
$r \equiv a \bmod b'$ where $b' = \frac{b}{2^{\nu_2(b)}}$ is the odd part of $b$, thus $\gcd(a, b') = \gcd(r, b')$.

Remark. For a practical implementation, one should remove factors of two in
Algorithm GB-gcd. If one replaces the return value in Step 1 by $a$, and in Step 3
by $2^{\nu_2(a)}\,$GB-gcd$\left(\frac{b}{2^{\nu_2(b)}}, \frac{r}{2^{\nu_2(b)}}\right)$, the algorithm directly computes $g = \gcd(a, b)$.
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A better bound on \rk\ and thus on t is proved in §5.1. Nonetheless the present 
bound is sufficient to guarantee the quasi-linear time complexity of the recursive 
algorithm. The improved bound only decreases the multiplying constant of the 
asymptotic complexity. 
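
The GB division and the GB Euclidean algorithm of Fig. 4 can be sketched as follows. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, the quotient is computed with Python's built-in modular inverse `pow(B, -1, m)` (Python 3.8 or later) rather than with the paper's Elementary-GB or Fast-GB routines, and the result is returned as a non-negative integer, which is a choice made here and not taken from the paper.

```python
def v2(x):
    """2-adic valuation of a non-zero integer."""
    return (x & -x).bit_length() - 1


def gb_div(a, b):
    """GB division for non-zero a, b with v2(a) < v2(b).

    Returns (q, r) with r = a + q*b/2**j, where j = v2(b) - v2(a),
    q odd, |q| < 2**j, and v2(r) > v2(b) (r may be 0).
    """
    j = v2(b) - v2(a)
    m = 1 << (j + 1)
    A, B = a >> v2(a), b >> v2(b)            # odd parts (exact shifts)
    q = (-A * pow(B, -1, m)) % m             # -A/B modulo 2**(j+1)
    if q > m // 2:                           # centered remainder
        q -= m
    return q, a + q * (b >> j)


def gb_gcd_odd(a, b):
    """Odd part of gcd(a, b), following Fig. 4 iteratively."""
    while b != 0:
        _, r = gb_div(a, b)
        a, b = b, r
    return abs(a) >> v2(a)
```

On the pair $(935, 714)$ the remainder sequence is $935, 714, 1292, 1360, 1632, 2176, 0$, and `gb_gcd_odd(935, 714)` returns 17.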

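For $n \ge 1$ and $q \in [-2^n + 1, 2^n - 1]$ we define the matrix 
$[q]_n = \begin{pmatrix} 0 & 2^n \\ 2^n & q \end{pmatrix}$. 

Let $r_0, r_1$ be two non-zero integers with $0 = \nu_2(r_0) < \nu_2(r_1)$, and $r_0, r_1, r_2, \ldots$ 
be their GB remainder sequence, and $q_1, q_2, \ldots$ be the corresponding quotients: 
$r_{i+1} = r_{i-1} + q_i \frac{r_i}{2^{n_i}}$ for any $i \ge 1$. Then the following relation holds 
for any $i \ge 1$: 

$$\begin{pmatrix} r_i \\ r_{i+1} \end{pmatrix} = \frac{1}{2^{\nu_2(r_i)}} [q_i]_{n_i} \cdots [q_1]_{n_1} \cdot \begin{pmatrix} r_0 \\ r_1 \end{pmatrix},$$

where $n_j = \nu_2(r_j) - \nu_2(r_{j-1}) \ge 1$ for any $j \ge 1$. 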
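
The matrix relation above can be checked numerically. The sketch below assumes the convention $[q]_n = \begin{pmatrix} 0 & 2^n \\ 2^n & q \end{pmatrix}$, which is a reconstruction consistent with the rest of the section, and uses hypothetical helper names; it accumulates the product $[q_i]_{n_i} \cdots [q_1]_{n_1}$ and asserts at every step that applying it to $(r_0, r_1)$ gives $2^{\nu_2(r_i)} (r_i, r_{i+1})$.

```python
def v2(x):
    """2-adic valuation of a non-zero integer."""
    return (x & -x).bit_length() - 1


def gb_div(a, b):
    """GB division: (q, r) with r = a + q*b/2**j, j = v2(b) - v2(a)."""
    j = v2(b) - v2(a)
    m = 1 << (j + 1)
    q = (-(a >> v2(a)) * pow(b >> v2(b), -1, m)) % m
    if q > m // 2:
        q -= m
    return q, a + q * (b >> j)


def check_matrix_relation(r0, r1):
    """Walk the GB remainder sequence of (r0, r1), with r0 odd and r1 even,
    and check the matrix relation after every division.
    Returns the list of GB quotients."""
    m00, m01, m10, m11 = 1, 0, 0, 1          # running product, starts as identity
    a, b, quotients = r0, r1, []
    while b != 0:
        n = v2(b) - v2(a)
        q, r = gb_div(a, b)
        # left-multiply by [q]_n = ((0, 2**n), (2**n, q))
        m00, m01, m10, m11 = (m10 << n, m11 << n,
                              (m00 << n) + q * m10, (m01 << n) + q * m11)
        a, b = b, r
        quotients.append(q)
        scale = 1 << v2(a)                   # 2**v2(r_i)
        assert (m00 * r0 + m01 * r1, m10 * r0 + m11 * r1) == (scale * a, scale * b)
    return quotients
```

For example, `check_matrix_relation(935, 714)` returns the quotients `[1, 1, 1, 1, -3]`.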

In what follows, we use implicitly the following simple fact several times. 

Lemma 5. Let $r_0, r_1$ be two non-zero integers with $0 = \nu_2(r_0) < \nu_2(r_1)$, and 
$r_0, r_1, r_2, \ldots$ be their GB remainder sequence. Let $d \ge 0$. Then there exists a 
unique $i \ge 0$ such that $\nu_2(r_i) \le d < \nu_2(r_{i+1})$. 




2.4 Computing Modular Inverses 

This subsection is independent of the remainder of the paper but is justified 
by the fact that computing modular inverses is a standard application of the 
Euclidean algorithm. 

Let $a, b$ be two non-zero integers with $0 = \nu_2(a) < \nu_2(b)$ and $\ell(a), \ell(b) \le n$. 
Suppose that we want to compute the inverse of $b$ modulo $a$, by using an extended 
version of the Euclidean algorithm based on the GB division. The execution of 
the extended GB Euclidean algorithm gives two integers $A$ and $B$ such that 
$Aa + Bb = 2^{\alpha} g$, where $\alpha = O(n)$ and $g = \gcd(a, b)$. From such a relation, it is 
easy to check whether $g = 1$. Suppose now that the inverse $B'$ of $b$ modulo $a$ does 
exist. From the relation $Aa + Bb = 2^{\alpha}$, we know that: 

$$B' = \frac{B}{2^{\alpha}} \bmod a.$$

Therefore, in order to obtain $B'$, it is sufficient to compute the inverse of $2^{\alpha}$ 
modulo $a$. By using Hensel's lifting (as in the algorithm Fast-GB of Fig. 3), 
we obtain the inverse of $a$ modulo $2^{\alpha}$. This gives $x$ and $y$ satisfying: $xa + y2^{\alpha} = 1$. 
Clearly $y$ is the inverse of $2^{\alpha}$ modulo $a$. 
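
The procedure just described can be sketched in Python. This is an illustration under assumptions, not the authors' code: all function names are hypothetical, only the cofactor of $b$ is tracked (the cofactor of $a$ is irrelevant modulo $a$), and the bookkeeping of the power of two (the exponents `e0`, `e1`, `e2`) is one possible way to keep all cofactors integral.

```python
def v2(x):
    """2-adic valuation of a non-zero integer."""
    return (x & -x).bit_length() - 1


def gb_div(a, b):
    """GB division: (q, r) with r = a + q*b/2**j, j = v2(b) - v2(a)."""
    j = v2(b) - v2(a)
    m = 1 << (j + 1)
    q = (-(a >> v2(a)) * pow(b >> v2(b), -1, m)) % m
    if q > m // 2:
        q -= m
    return q, a + q * (b >> j)


def inverse_mod_power_of_two(B, n):
    """Inverse of odd B modulo 2**n by Hensel (Newton) lifting."""
    q, bits = 1, 1
    while bits < n:
        bits *= 2
        q = (q + q * (1 - B * q)) % (1 << bits)
    return q % (1 << n)


def gb_modinv(b, a):
    """Inverse of b modulo a, for odd a > 1, even non-zero b, gcd(a, b) = 1."""
    # Invariant: V_i * b == 2**e_i * r_i (mod a) for the two current remainders.
    r0, V0, e0 = a, 0, 0
    r1, V1, e1 = b, 1, 0
    while r1 != 0:
        j = v2(r1) - v2(r0)
        q, r2 = gb_div(r0, r1)
        e2 = max(e0, e1 + j)
        V2 = (V0 << (e2 - e0)) + q * (V1 << (e2 - e1 - j))
        r0, V0, e0, r1, V1, e1 = r1, V1, e1, r2, V2, e2
    if abs(r0) != 1 << v2(r0):               # last remainder must be +-2**k
        raise ValueError("a and b are not coprime")
    alpha = e0 + v2(r0)
    sign = 1 if r0 > 0 else -1               # V0 * b == sign * 2**alpha (mod a)
    x = inverse_mod_power_of_two(a, alpha)   # x*a == 1 (mod 2**alpha)
    y = (1 - x * a) >> alpha                 # exact: x*a + y*2**alpha == 1
    return (sign * V0 * y) % a
```

For example, `gb_modinv(2, 7)` returns 4, the inverse of 2 modulo 7.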

Since multiplication, Hensel's lifting and division on numbers of size $O(n)$ 
can be performed in time $O(M(n))$ (see [11]), given two $n$-bit integers $a$ and $b$, 
the additional cost to compute the inverse of $b$ modulo $a$ given the output of an 
extended Euclidean algorithm based on the GB division is $O(M(n))$. 



3 The Recursive Algorithm 

We now describe the recursive algorithm based on the GB division. This 
description closely resembles the one of [12]. It uses two routines: the algorithm 
Half-GB-gcd and the algorithm Fast-GB-gcd. Given two non-zero integers 
$r_0$ and $r_1$ with $0 = \nu_2(r_0) < \nu_2(r_1)$, the algorithm Half-GB-gcd outputs 
the GB remainders $r_i$ and $r_{i+1}$ of the GB remainder sequence of $(r_0, r_1)$ that 
satisfy $\nu_2(r_i) \le \ell(r_0)/2 < \nu_2(r_{i+1})$. It also outputs the corresponding 
matrix $[q_i]_{n_i} \cdots [q_1]_{n_1}$. Then we describe the algorithm Fast-GB-gcd, 
which, given two integers $a$ and $b$, outputs the gcd of $a$ and $b$ by making successive 
calls to the algorithm Half-GB-gcd. 

The algorithm Half-GB-gcd works as follows: a quarter of the least significant 
bits of $a$ and $b$ are eliminated by doing a recursive call on the low $\ell(a)/2$ of 
the bits of $a$ and $b$. The crucial point is that the GB quotients computed for the 
truncated numbers are exactly the same as the first GB quotients of $a$ and $b$. 
Therefore, by multiplying $a$ and $b$ by the matrix obtained recursively one gets 
two remainders $(a', b')$ of the GB remainder sequence of $(a, b)$. A single step of 
the GB Euclidean algorithm is performed on $(a', b')$, which gives a new remainder 
pair $(b', r)$. Then there is a second recursive call on approximately $\ell(a)/2$ of 
the least significant bits of $(b', r)$. The size of the inputs of this second recursive 
call is similar to the one of the first recursive call. Finally, the corresponding 
remainders $(c, d)$ of the GB remainder sequence of $(a, b)$ are computed using the 
returned matrix $R_2$, and the output matrix $R$ is calculated from $R_1$, $R_2$ and the 
GB quotient of $(a', b')$. Fig. 5 illustrates the execution of this algorithm. 

Note that in the description of the algorithm Half-GB-gcd in Fig. 6, a 
routine GB' is used. This is a simple modification of the GB division: given $a$ 
and $b$ as input with $0 = \nu_2(a) < \nu_2(b)$, it outputs their GB quotient $q$, and 
$r/2^{\nu_2(b)}$ if $r$ is their GB remainder. The algorithm Fast-GB-gcd calls the 
algorithm Half-GB-gcd several times to decrease the lengths of the remainders quickly. 

The main advantage over the other quasi-linear time algorithms for the in- 
teger gcd is that if a matrix R is returned by a recursive call of the algorithm 
Half-GB-gcd, then it contains only “correct quotients” . There is no need to go 
back in the GB remainder sequence in order to make the quotients correct, and 
thus no need to store the sequence of quotients. The underlying reason is that 
the remainders are shortened by the least significant bits, and since the carries 
go in the direction of the most significant bits, these two phenomena do not 
interfere. For that reason, the algorithm is as simple as the Knuth-Schönhage 
algorithm in the case of polynomials. 
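
The recursive structure described in this section can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' GNU MP code: the function names are hypothetical, the routine called GB' here is assumed to return the GB remainder divided by $2^{\nu_2(b)}$, matrices $[q]_n$ are assumed to be $\begin{pmatrix} 0 & 2^n \\ 2^n & q \end{pmatrix}$ stored as 4-tuples, and no quadratic base case or word-level optimization is included.

```python
from math import inf


def v2(x):
    """2-adic valuation; infinite for 0."""
    return inf if x == 0 else (x & -x).bit_length() - 1


def gb_div_scaled(a, b):
    """GB' (assumed form): for odd a and even non-zero b, return (q, r / 2**v2(b))."""
    j = v2(b)
    B, m = b >> j, 1 << (j + 1)
    q = (-a * pow(B, -1, m)) % m
    if q > m // 2:
        q -= m
    return q, (a + q * B) >> j               # exact shift: v2(a + q*B) > j


def mat_mul(X, Y):
    return (X[0] * Y[0] + X[1] * Y[2], X[0] * Y[1] + X[1] * Y[3],
            X[2] * Y[0] + X[3] * Y[2], X[2] * Y[1] + X[3] * Y[3])


def mat_vec(R, x, y):
    return R[0] * x + R[1] * y, R[2] * x + R[3] * y


def half_gb_gcd(a, b):
    """Return (j, R, c, d) with R*(a, b) = 2**(2j) * (c, d), c odd, and
    (2**j * c, 2**j * d) consecutive GB remainders of (a, b)."""
    k = abs(a).bit_length() // 2
    if v2(b) > k:
        return 0, (1, 0, 0, 1), a, b
    s1 = 2 * (k // 2) + 1
    mask = (1 << s1) - 1
    j1, R1, c1, d1 = half_gb_gcd(a & mask, b & mask)      # low bits only
    x, y = mat_vec(R1, a >> s1, b >> s1)                  # high bits
    e = s1 - 2 * j1
    a2, b2 = (x << e) + c1, (y << e) + d1
    j0 = v2(b2)
    if j0 + j1 > k:
        return j1, R1, a2, b2
    q, r = gb_div_scaled(a2, b2)                          # one GB step
    s2 = 2 * (k - (j0 + j1)) + 1
    mask = (1 << s2) - 1
    bb = b2 >> j0
    j2, R2, c2, d2 = half_gb_gcd(bb & mask, r & mask)
    x, y = mat_vec(R2, bb >> s2, r >> s2)
    e = s2 - 2 * j2
    Q = (0, 1 << j0, 1 << j0, q)                          # [q]_{j0}
    return j1 + j0 + j2, mat_mul(R2, mat_mul(Q, R1)), (x << e) + c2, (y << e) + d2


def fast_gb_gcd(a, b):
    """gcd(a, b) for odd a and even b."""
    while True:
        _, _, a2, b2 = half_gb_gcd(a, b)
        if b2 == 0:
            return abs(a2)
        _, r = gb_div_scaled(a2, b2)
        a, b = b2 >> v2(b2), r
```

On the pair $(935, 714)$, `half_gb_gcd` returns $j = 5$, $c = 51$, $d = 68$, that is the consecutive remainders $1632 = 2^5 \cdot 51$ and $2176 = 2^5 \cdot 68$, and `fast_gb_gcd(935, 714)` returns 17.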

4 Correctness of the Recursive Algorithm 

In this section, we show that the algorithm Fast-GB-gcd of Fig. 7 is correct. 
We first give some results about the GB division, and then we show the 
correctness of the algorithm Half-GB-gcd which clearly implies the correctness of the 
algorithm Fast-GB-gcd. 

Fig. 5. The recursive structure of the algorithm Half-GB-gcd. 



4.1 Some Properties of the GB Division 

The properties described below are very similar to the ones of the standard 
Euclidean division that make the Knuth-Schönhage algorithm possible. The first 
result states that the $\nu_2(b) - \nu_2(a) + 1$ last non-zero bits of $a$ and of $b$ suffice to 
compute the GB quotient of two integers $a$ and $b$. 

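Lemma 6. Let $a$, $b$, $a'$ and $b'$ be such that $a' = a \bmod 2^l$ and $b' = b \bmod 2^l$ 
with $l \ge 2\nu_2(b) + 1$. Assume that $0 = \nu_2(a) < \nu_2(b)$. Let $(q,r) = GB(a,b)$ and 
$(q',r') = GB(a',b')$. Then $q = q'$ and $r = r' \bmod 2^{l - \nu_2(b)}$. 

Proof. By definition, $q = -\frac{a}{b/2^{\nu_2(b)}} \ \mathrm{cmod}\ 2^{\nu_2(b)+1}$. Since $l \ge 2\nu_2(b) + 1$, 
we have $\frac{b}{2^{\nu_2(b)}} = \frac{b'}{2^{\nu_2(b')}} \bmod 2^{\nu_2(b)+1}$ and $a = a' \bmod 2^{\nu_2(b)+1}$, and therefore $q = q'$. 

Moreover, $r = a + q \frac{b}{2^{\nu_2(b)}}$ and $r' = a' + q \frac{b'}{2^{\nu_2(b)}}$. Consequently $r = r' \bmod 2^{l - \nu_2(b)}$. 

Algorithm Half-GB-gcd. 

Input: $a, b$ satisfying $0 = \nu_2(a) < \nu_2(b)$. 

Output: An integer $j$, an integer matrix $R$ and two integers $c$ and $d$ with 
$0 = \nu_2(c) < \nu_2(d)$, such that $\begin{pmatrix} c \\ d \end{pmatrix} = 2^{-2j} R \cdot \begin{pmatrix} a \\ b \end{pmatrix}$, and $c^* = 2^j c$, $d^* = 2^j d$ 
are the two consecutive remainders of the GB remainder sequence of $(a, b)$ 
that satisfy $\nu_2(c^*) \le \ell(a)/2 < \nu_2(d^*)$. 

1. $k := \lfloor \ell(a)/2 \rfloor$. 
2. If $\nu_2(b) > k$, then return $0$, $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $a$, $b$. 
3. $k_1 := \lfloor k/2 \rfloor$. 
4. $a := a_1 2^{2k_1+1} + a_0$, $b := b_1 2^{2k_1+1} + b_0$ with $0 \le a_0, b_0 < 2^{2k_1+1}$. 
5. $j_1, R_1, c_1, d_1 :=$ Half-GB-gcd$(a_0, b_0)$. 
6. $\begin{pmatrix} a' \\ b' \end{pmatrix} := 2^{2k_1+1-2j_1} R_1 \cdot \begin{pmatrix} a_1 \\ b_1 \end{pmatrix} + \begin{pmatrix} c_1 \\ d_1 \end{pmatrix}$, $j_0 := \nu_2(b')$. 
7. If $j_0 + j_1 > k$, then return $j_1$, $R_1$, $a'$, $b'$. 
8. $(q, r) := GB'(a', b')$. 
9. $k_2 := k - (j_0 + j_1)$. 
10. $\frac{b'}{2^{j_0}} := b'_1 2^{2k_2+1} + b'_0$, $r := r_1 2^{2k_2+1} + r_0$ with $0 \le b'_0, r_0 < 2^{2k_2+1}$. 
11. $j_2, R_2, c_2, d_2 :=$ Half-GB-gcd$(b'_0, r_0)$. 
12. $\begin{pmatrix} c \\ d \end{pmatrix} := 2^{2k_2+1-2j_2} R_2 \cdot \begin{pmatrix} b'_1 \\ r_1 \end{pmatrix} + \begin{pmatrix} c_2 \\ d_2 \end{pmatrix}$. 
13. Return $j_1 + j_0 + j_2$, $R_2 \cdot [q]_{j_0} \cdot R_1$, $c$, $d$. 

Fig. 6. Algorithm Half-GB-gcd. 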



This result can be seen as a continuity statement: two pairs of 2-adic integers 
$(a, b)$ and $(a', b')$ which are sufficiently close for the 2-adic norm (i.e. some least 
significant bits of $a$ and $a'$, and some of $b$ and $b'$ are equal) have the same quotient 
and similar remainders (the closer the pairs, the closer the remainders). The 
second lemma extends this result to the GB continued fraction expansions: if 
$(a, b)$ and $(a', b')$ are sufficiently close, their first GB quotients are identical. We 
obtain this result by applying the first one several times. 
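
The continuity statement of Lemma 6 is easy to test numerically. The following Python sketch assumes the modulus in the conclusion of the lemma is $2^{l - \nu_2(b)}$ (a reconstruction that follows from the definition of the GB division); the names `gb_div` and `lemma6_holds` are hypothetical.

```python
def v2(x):
    """2-adic valuation of a non-zero integer."""
    return (x & -x).bit_length() - 1


def gb_div(a, b):
    """GB division: (q, r) with r = a + q*b/2**j, j = v2(b) - v2(a)."""
    j = v2(b) - v2(a)
    m = 1 << (j + 1)
    q = (-(a >> v2(a)) * pow(b >> v2(b), -1, m)) % m
    if q > m // 2:
        q -= m
    return q, a + q * (b >> j)


def lemma6_holds(a, b, extra_bits=0):
    """Truncate odd a and even non-zero b to l = 2*v2(b) + 1 + extra_bits low
    bits and compare the GB division of the truncated pair with the original."""
    vb = v2(b)
    l = 2 * vb + 1 + extra_bits
    a2, b2 = a % (1 << l), b % (1 << l)      # a' and b' of the lemma
    q, r = gb_div(a, b)
    q2, r2 = gb_div(a2, b2)
    return q == q2 and (r - r2) % (1 << (l - vb)) == 0
```

With $a = 935$ and $b = 714$, the truncated pair is $(7, 2)$; both divisions give the quotient 1, and the remainders 1292 and 8 agree modulo 4.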



Algorithm Fast-GB-gcd. 

Input: $a, b$ satisfying $0 = \nu_2(a) < \nu_2(b)$. 
Output: $g = \gcd(a,b)$. 

1. $j, R, a', b' :=$ Half-GB-gcd$(a, b)$. 
2. If $b' = 0$, return $a'$. 
3. $(q, r) := GB'(a', b')$. 
4. Return Fast-GB-gcd$(b', r')$. 

Fig. 7. Algorithm Fast-GB-gcd. 



Lemma 7. Let $a$, $b$, $a'$ and $b'$ be such that $a' = a \bmod 2^{2k+1}$ and $b' = b \bmod 2^{2k+1}$, 
with $k \ge 0$. Suppose that $0 = \nu_2(a) < \nu_2(b)$. Let $r_0 = a, r_1 = b, r_2, \ldots$ be the GB 
remainder sequence of $(a, b)$, and let $q_1, q_2, \ldots$ be the corresponding GB quotients: 
$r_{j+1} = r_{j-1} + q_j \frac{r_j}{2^{n_j}}$, with $n_j = \nu_2(r_j) - \nu_2(r_{j-1})$. Let $r'_0 = a', r'_1 = b', r'_2, \ldots$ be 
the GB remainder sequence of $(a', b')$, and let $q'_1, q'_2, \ldots$ be the corresponding GB 
quotients: $r'_{j+1} = r'_{j-1} + q'_j \frac{r'_j}{2^{n'_j}}$, with $n'_j = \nu_2(r'_j) - \nu_2(r'_{j-1})$. 
Then, if $r_{i+1}$ is the first remainder such that $\nu_2(r_{i+1}) > k$, we have $q_j = q'_j$ and 
$r_{j+1} = r'_{j+1} \bmod 2^{2k+1-\nu_2(r_j)}$ for any $j \le i$. 

Proof. We prove this result by induction on $j \ge 0$. This is true for $j = 0$, because 
$a' = a \bmod 2^{2k+1}$ and $b' = b \bmod 2^{2k+1}$. Suppose now that $1 \le j \le i$. We use 
Lemma 6 with $\frac{r_{j-1}}{2^{\nu_2(r_{j-1})}}$, $\frac{r_j}{2^{\nu_2(r_{j-1})}}$, $\frac{r'_{j-1}}{2^{\nu_2(r_{j-1})}}$, $\frac{r'_j}{2^{\nu_2(r_{j-1})}}$ and $l = 2k+1-2\nu_2(r_{j-1})$. By 
induction, modulo $2^{2k+1-2\nu_2(r_{j-1})}$, $\frac{r_{j-1}}{2^{\nu_2(r_{j-1})}} = \frac{r'_{j-1}}{2^{\nu_2(r_{j-1})}}$ and $\frac{r_j}{2^{\nu_2(r_{j-1})}} = \frac{r'_j}{2^{\nu_2(r_{j-1})}}$. Since $j \le i$, we have, 
by definition of $i$, $2k+1 - 2\nu_2(r_{j-1}) \ge 2(\nu_2(r_j) - \nu_2(r_{j-1})) + 1$, and consequently 
we can apply Lemma 6. Thus $q_j = q'_j$ and $r_{j+1} = r'_{j+1} \bmod 2^{2k+1-\nu_2(r_j)}$. 



In practice, this lemma says that $k$ bits can be gained with respect to the initial 
pair $(a, b)$ by using only $2k$ bits of $a$ and $2k$ bits of $b$. This is the advantage 
of using the GB division instead of the standard division: in the case of the 
standard Euclidean division, this lemma is only “almost true”, because some of 
the last quotients before gaining $k$ bits can differ, and have to be repaired. 



4.2 Correctness of the Half-GB-gcd Algorithm 

To show the correctness of the algorithm Fast-GB-gcd, it suffices to show the 
correctness of the algorithm Half-GB-gcd, which is established in the following 
theorem. (Since each call to Fast-GB-gcd which does not return in Step 2 
performs at least one GB division, the termination is ensured.) 

Theorem 1. The algorithm Half-GB-gcd of Fig. 6 is correct. 



Proof. We prove the correctness of the algorithm by induction on the size of 
the inputs. If $\ell(a) = 1$, then the algorithm finishes at Step 2 because $\nu_2(b) \ge 1$. 
Suppose now that $k \ge 2$ and that $\nu_2(b) \le k$. 

Since $2\lfloor k/2 \rfloor + 1 < \ell(a)$, Step 5 is a recursive call (its inputs satisfy the input 
conditions). By induction $j_1$, $R_1$, $c_1$ and $d_1$ satisfy 
$\begin{pmatrix} c_1 \\ d_1 \end{pmatrix} = 2^{-2j_1} R_1 \cdot \begin{pmatrix} a_0 \\ b_0 \end{pmatrix}$, and 
$2^{j_1} c_1$ and $2^{j_1} d_1$ are the consecutive remainders $r'_{i_1}$ and $r'_{i_1+1}$ of the GB remainder 
sequence of $r'_0 = a_0$ and $r'_1 = b_0$ that satisfy $\nu_2(r'_{i_1}) \le k_1 < \nu_2(r'_{i_1+1})$. From 
Lemma 7, we know that $2^{-j_1} R_1 \cdot \begin{pmatrix} a \\ b \end{pmatrix}$ are two consecutive remainders $2^{j_1} a' = r_{i_1}$ 
and $2^{j_1} b' = r_{i_1+1}$ of the GB remainder sequence of $r_0 = a$ and $r_1 = b$, and they 
satisfy $r_{i_1} = r'_{i_1}$ and $r_{i_1+1} = r'_{i_1+1}$ modulo $2^{k_1+1}$. From these last equalities, we 
have that $\nu_2(r_{i_1}) = \nu_2(r'_{i_1}) \le k_1 < \nu_2(r_{i_1+1})$. Thus, if the execution 
of the algorithm stops at Step 7, the output is correct. 

Otherwise, $r_{i_1+2}$ is computed at Step 8. At Step 9 we compute $k_2 = 
k - \nu_2(r_{i_1+1})$. Step 7 ensures that $k_2 \ge 0$. Since $\nu_2(r_{i_1+1}) > \lfloor k/2 \rfloor$, we have 
$k_2 \le \lceil k/2 \rceil - 1$. Therefore Step 11 is a recursive call (and the inputs satisfy the 
input conditions). By induction, $j_2$, $R_2$, $c_2$ and $d_2$ satisfy: 
$\begin{pmatrix} c_2 \\ d_2 \end{pmatrix} = 2^{-2j_2} R_2 \cdot \begin{pmatrix} b'_0 \\ r_0 \end{pmatrix}$, 
and $2^{j_2} c_2$ and $2^{j_2} d_2$ are the consecutive remainders $r'_{i_2}$ and $r'_{i_2+1}$ of the GB 
remainder sequence of $(b'_0, r_0)$. Moreover, $\nu_2(r'_{i_2}) \le k_2 < \nu_2(r'_{i_2+1})$. From Lemma 7, 
we know that $2^{-j_2} R_2 \cdot \begin{pmatrix} 2^{j_1} b' \\ 2^{j_1+j_0} r \end{pmatrix}$ are two consecutive remainders $r_i$ and $r_{i+1}$ 
(with $i = i_1 + i_2 + 1$) of the GB remainder sequence of $(a, b)$, that $2^{-j_1-j_0} r_i = 
2^{j_2} c_2 \bmod 2^{k_2+1}$ and that $2^{-j_1-j_0} r_{i+1} = 2^{j_2} d_2 \bmod 2^{k_2+1}$. Therefore the following 
sequence of inequalities is valid: $\nu_2(r_i) = j_1 + j_2 + \nu_2(b') \le k < \nu_2(r_{i+1})$. This 
ends the proof of the theorem. 



5 Analysis of the Algorithms 

In this section, we first study the GB Euclidean algorithm. In particular we 
give a worst-case bound regarding the length of the GB remainder sequence. 
Then we bound the complexity of the recursive algorithm in the case of the 
use of Schönhage-Strassen multiplication, and we give some intuition about the 
average complexity. 



5.1 Length of the GB Remainder Sequence 

In this subsection, we bound the size of the matrix $[q_i]_{n_i} \cdots [q_1]_{n_1}$, where 
the $q_j$'s are the GB quotients of a truncated GB remainder sequence $r_0, \ldots, r_{i+1}$ 
with $\nu_2(r_i) \le \nu_2(r_0) + d < \nu_2(r_{i+1})$ for some $d \ge 0$. This will make possible the 
analysis of the lengths and the number of the GB remainders. As mentioned in 
§2, this subsection is not necessary to prove the quasi-linear time complexity of 
the algorithm. 

Theorem 2. Let $d \ge 1$. Let $r_0, r_1$ with $0 = \nu_2(r_0) < \nu_2(r_1)$, and $r_0, r_1, \ldots, r_{i+1}$ 
their first GB remainders, where $i$ is such that $\nu_2(r_i) \le d < \nu_2(r_{i+1})$. We 
consider the matrix $\frac{1}{2^{\nu_2(r_i)}} [q_i]_{n_i} \cdots [q_1]_{n_1}$, where the $q_j$'s are the GB quotients and 
$n_j = \nu_2(r_j) - \nu_2(r_{j-1})$ for any $1 \le j \le i$. Let $M$ be the maximum of the absolute 
values of the four entries of this matrix. Then we have: 

- If $d = 0$ or $1$, $M = 1$, 

- If $d = 3$, $M \le 11/8$, 

- If $d = 5$, $M \le 67/32$, 

- If $d = 2, 4$ or $d \ge 6$, $M \le \frac{2}{\sqrt{17}} \left( \left( \frac{1+\sqrt{17}}{4} \right)^{d+1} - \left( \frac{1-\sqrt{17}}{4} \right)^{d+1} \right)$. 

Moreover, all these bounds are reached, in particular, the last one is reached 
when $n_j = 1$ and $q_j = 1$ for any $1 \le j \le i$. 

The proof resembles the worst case analysis of the Gaussian algorithm in [9] . 

Proof. We introduce a partial order on the $2 \times 2$ matrices: $A \le B$ if and only 
if for any coordinate $[i,j]$, $A[i,j] \le B[i,j]$. First, the proof can be restricted to 
the case where the $q_j$'s are non-negative integers, because we have the inequality 
$|[q_i]_{n_i} \cdots [q_1]_{n_1}| \le [|q_i|]_{n_i} \cdots [|q_1|]_{n_1}$. This can be easily shown by induction on $i$ 
by using the properties: $|A \cdot B| \le |A| \cdot |B|$, and if the entries of $A, A', B, B'$ are 
non-negative, $A \le A'$ and $B \le B'$ implies $A \cdot B \le A' \cdot B'$. 

Consequently we are looking for the maximal elements for the partial 
order $\ge$ in the set: $\{ [q_i]_{n_i} \cdots [q_1]_{n_1} \ / \ n_1 + \ldots + n_i \le d, \ \forall 1 \le j \le i, \ 0 \le q_j \le 2^{n_j} - 
1 \text{ and } n_j \ge 1 \}$. We can restrict the analysis to the case where $n_1 + \ldots + n_i = d$ 
and all the $q_j$'s are maximal, which gives the set: 

$\{ [2^{n_i} - 1]_{n_i} \cdots [2^{n_1} - 1]_{n_1} \ / \ n_1 + \ldots + n_i = d \}$. 

Remark now that $[2^n - 1]_n \le [1]_1^n$ for any $n \ge 3$. Therefore, it is sufficient to 
consider the case where the $n_j$'s are in $\{1,2\}$. Moreover, for any integer $j \ge 0$, 
$[3]_2 \cdot [1]_1^j \cdot [3]_2 \le [1]_1^{j+4}$, and we also have the inequalities $[3]_2^2 \le [1]_1^4$, $[1]_1^5 \cdot [3]_2 \le [1]_1^7$, 
$[3]_2 \cdot [1]_1^5 \le [1]_1^7$ and $[1]_1^2 \cdot [3]_2 \cdot [1]_1^2 \le [1]_1^6$. 

From these relations, we easily obtain the maximal elements: 

- For $d = 1$, $[1]_1$. 

- For $d = 2$, $[1]_1^2$ and $[3]_2$. 

- For $d = 3$, $[1]_1^3$, $[3]_2 \cdot [1]_1$ and $[1]_1 \cdot [3]_2$. 

- For $d = 4$, $[1]_1^4$, $[3]_2 \cdot [1]_1^2$, $[1]_1 \cdot [3]_2 \cdot [1]_1$ and $[1]_1^2 \cdot [3]_2$. 

- For $d = 5$, $[1]_1^5$, $[3]_2 \cdot [1]_1^3$, $[1]_1^2 \cdot [3]_2 \cdot [1]_1$, $[1]_1 \cdot [3]_2 \cdot [1]_1^2$ and $[1]_1^3 \cdot [3]_2$. 

- For $d = 6$, $[1]_1^6$, $[3]_2 \cdot [1]_1^4$, $[1]_1^3 \cdot [3]_2 \cdot [1]_1$, $[1]_1 \cdot [3]_2 \cdot [1]_1^3$ and $[1]_1^4 \cdot [3]_2$. 

- For $d = 7$, $[1]_1^7$, $[1]_1 \cdot [3]_2 \cdot [1]_1^4$ and $[1]_1^4 \cdot [3]_2 \cdot [1]_1$. 

- For $d \ge 8$, $[1]_1^d$. 

The end of the proof is obvious once we note that 
$2^{-d} [1]_1^d = \begin{pmatrix} u_{d-1} & u_d \\ u_d & u_{d+1} \end{pmatrix}$, 
where $u_0 = 0$, $u_1 = 1$ and $u_i = u_{i-2} + \frac{1}{2} u_{i-1}$. 


From this result on the quotient matrices, we can easily deduce the following 
on the size of the remainders and on the length of the remainder sequence. 



Theorem 3. Let $r_0, r_1$ be two non-zero integers with $0 = \nu_2(r_0) < \nu_2(r_1)$, 
and $r_0, r_1, \ldots, r_{t+1}$ their complete GB remainder sequence, i.e. with $r_{t+1} = 0$. 
Assume that $0 \le j \le t$. Then: 

[Displayed lower and upper bounds on $|r_j|$ in terms of $\nu_2(r_j)$, and the sequence that reaches the upper bound: not legible in this copy.] 

Moreover, if $\ell(r_0), \ell(r_1) \le n$, then we have: $t \le n / \log_2 \frac{\sqrt{17}-1}{2}$. 

Proof. (Sketch) For the inequalities concerning $|r_j|$, use the maximal element 
$[1]_1^d$ from the proof of Theorem 2, with $d = \nu_2(r_i)$ where $i = j - 1$. Remark then 
that the upper bound grows far slower with $\nu_2(r_j)$ than the lower bound: this 
fact gives an upper bound on $\nu_2(r_t)$ and therefore on $t$. 

As a comparison, we recall that the worst case for the standard Euclidean division 
corresponds to the Fibonacci sequence, with $d \le n / \log_2 \frac{1+\sqrt{5}}{2} + o(n)$. Remark 
that $1/\log_2 \frac{1+\sqrt{5}}{2} \approx 1.440$ and $1/\log_2 \frac{\sqrt{17}-1}{2} \approx 1.555$. 




5.2 Complexity Bound for the Recursive Algorithm 

In what follows, $H(n)$ and $G(n)$ respectively denote the maximum of the number 
of bit operations performed by the algorithms Half-GB-gcd and Fast-GB-gcd, 
given as inputs two integers of lengths at most $n$. 

Lemma 8. Let $c = \frac{1}{2} \log_2 \frac{1+\sqrt{17}}{2} \approx 0.679$. The following two relations hold: 

- $G(n) = H(n) + G(\lceil cn \rceil) + O(n)$, 

- $H(n) = 2H(\lfloor n/2 \rfloor + 1) + O(M(n))$. 

Proof. The first relation is an obvious consequence of Theorem 2. We now prove 
the second relation. The costs of Steps 1, 2, 3, 4, 7, 9 and 10 are negligible. Steps 5 
and 11 are recursive calls and the cost of each one is bounded by $H(\lfloor n/2 \rfloor + 1)$. 
Steps 6, 12 and 13 consist of multiplications of integers of size $O(n)$. Finally, 
Step 8 is a single GB division, and we proved in Lemma 3 that it can be performed 
in time $O(M(n))$. 

From this result and the fact that $c < 1$, one easily obtains the following theorem: 

Theorem 4. The algorithm Fast-GB-gcd of Fig. 7 runs in quasi-linear time. 
More precisely, $G(n) = O(M(n) \log n)$. 

The constants that one can derive from the previous proofs are rather large, 
and not very significant in practice. In fact, for randomly chosen n-bit integers, 
the quotients of the GB remainder sequence are 0(1), and therefore Step 8 
of the algorithm Half-GB-gcd has a negligible cost. Moreover, the worst-case 
analysis on the size of the coefficients of the returned matrices gives a worst-case 
bound O )) which happens to be 0(2”) in practice. With these two 

heuristics, the “practical cost” of the algorithm Half-GB-gcd satisfies the same 
recurrence as the Knuth-Schönhage algorithm: 

$$H(n) \approx 2H\left(\frac{n}{2}\right) + kM(n),$$

where the constant $k$ depends on the implementation [11]. 



6 Implementation Issues 

We have implemented the algorithms described in this paper in GNU MP [3]. 
In this section, we first give some “tricks” that we implemented to improve the 
efficiency of the algorithm, and then we give some benchmarks. 



6.1 Some Savings 

First of all, note that some multiplications can be saved easily from the fact 
that when the algorithm Fast-GB-gcd calls the algorithm Half-GB-gcd, the 
returned matrix is not used. Therefore, for such “top-level” calls to the algorithm 
Half-GB-gcd, there is no need to compute the product $R_2 \cdot [q]_{j_0} \cdot R_1$. 

Note also that for interesting sizes of inputs (our implementation of the 
recursive algorithm is faster than usual Euclidean algorithms for several thousands 
of bits), we are in the domains of Karatsuba and Toom-Cook multiplications, 
and below the domain of FFT-based multiplication. This leads to some 
improvements. For example, the algorithm Fast-GB-gcd should use calls to the 
algorithm Half-GB-gcd in order to gain $\gamma n$ bits instead of $n/2$, with a constant 
$\gamma \ne \frac{1}{2}$ that has to be optimized. 

Below a certain threshold in the size of the inputs (namely several hundreds 
of bits), a naive quadratic algorithm that has the requirements of algorithm 
Half-GB-gcd is used. Moreover, each time the algorithm has to compute a GB 
quotient, it computes several of them in order to obtain a 2 x 2 matrix with 
entries of length as close to the size of machine words as possible. This is done 
by considering only the two least significant machine words of the remainders 
(which gives a correct result, because of Lemma 6). 



6.2 Comparison to Other Implementations of Subquadratic Gcd 
Algorithms 

We compared our implementation in GNU MP (using the ordinary integer 
interface mpz) with those of Magma V2.10-12 and Mathematica 5.0, which 
both provide a subquadratic integer gcd. This comparison was performed on 
laurent3.medicis.polytechnique.fr, an Athlon MP 2200+. Our implementation 
wins over the quadratic gcd of GNU MP up from about 2500 words of 
32 bits, i.e. about 24000 digits. We used as test numbers both the worst case of 
the classical subquadratic gcd, i.e. consecutive Fibonacci numbers $F_n$ and $F_{n-1}$, 
and the worst case of the binary variant, i.e. $G_n$ and $2G_{n-1}$, where $G_0 = 0$, 
$G_1 = 1$, $G_n = -G_{n-1} + 4G_{n-2}$, which gives all binary quotients equal to 1. Our 
experiments show that our implementation in GNU MP of the binary recursive 
gcd is 3 to 4 times faster than the implementations of the classical recursive gcd 
in Magma or Mathematica. This ratio does not vary much with the inputs. For 
example, ratios for $F_n$ and $G_n$ are quite similar. 

type, n               Magma V2.10-12   Mathematica 5.0   Fast-GB-gcd (GNU MP) 
$F_n$, $10^6$              2.89             2.11               0.70 
$F_n$, $2 \cdot 10^6$      7.74             5.46               1.91 
$F_n$, $5 \cdot 10^6$      23.3             17.53              6.74 
$F_n$, $10^7$              59.1             43.59              17.34 
$G_n$, $5 \cdot 10^5$      2.78             2.06               0.71 
$G_n$, $10^6$              7.99             5.30               1.94 

Fig. 8. Timings in seconds of the gcd routines of Magma, Mathematica and our 
implementation in GNU MP. 



Abstract. Classical Gauss sums are Lagrange resolvents formed from 
the Gaussian periods lying in a cyclic extension $K$ over $\mathbb{Q}$ of prime 
conductor. Elliptic Gauss sums and elliptic resolvents (which are particular 
instances of Lagrange resolvents) play an important role in the theory of 
abelian extensions of imaginary quadratic fields. Motivated by the close 
relationship between the Stark units and Gaussian periods in a cyclic 
extension $K \subset \mathbb{Q}(\zeta_p)$ and the analogies between Stark units over totally 
real fields and elliptic units over imaginary quadratic fields, we consider 
for the first time Lagrange resolvents constructed from Stark units over 
totally real fields and study the differences and similarities they share 
with classical Gauss sums. 



1 Introduction 

In 1801, Gauss introduced what are now called “Gaussian periods” in the seventh 
section of his Disquisitiones Arithmeticae. Lagrange had already introduced his 
“resolvantes” in 1770 but didn’t coin this particular name for them until 1808. 
Lagrange gave an exposition of Gauss’s 7th section in 1808 and formed “classical 
Gauss sums” from Gaussian periods as a special instance of his resolvents. Gauss 
apparently preferred his periods to the corresponding resolvents except in the 
quadratic case (see [We]). During the 19th century, “elliptic Gauss sums” and 
“elliptic resolvents” were defined by replacing the Gaussian periods by division 
values of elliptic functions. Elliptic Gauss sums were introduced by Eisenstein 
to prove certain reciprocity laws (see [Le], p. 299). Fairly recently, the elliptic 
resolvents introduced by Abel and Hermite have been employed in the study 
of Galois module structure questions involving abelian extensions of imaginary 
quadratic fields (see [GT], p. 80; see also [Sc], p. 288, for another appearance of 
elliptic resolvents). 

Our goal in the present paper is to define and study an analogous resolvent 
construction over a given totally real number field F by replacing the Gaussian 
periods by special algebraic integers known as “Stark units” that are conjectured 
to lie in abelian extensions of F. This study is motivated by the relationship and 
analogies that exist between Stark units, Gaussian periods, and elliptic units. 

In the remainder of this section, we establish some conventions and basic 
properties of Lagrange resolvents used throughout the paper. In Section 2, we 
recall some famous results concerning classical Gauss sums which serve as a 
source of comparison with the Lagrange resolvents constructed from Stark units 
in Sections 2 and 3. The relationship between the Stark units and Gaussian 
periods in a real cyclic extension $K \subset \mathbb{Q}(\zeta_p)$ is studied at the end of Section 2. 
Throughout Section 2, the base field is $\mathbb{Q}$ but in Section 3 we allow $F$ to be 
any totally real number field and study Lagrange resolvents constructed from 
Stark units in this more general setting. The computations we carried out over 
real quadratic base fields are discussed in Section 3. 

If $K$ is an algebraic number field, we write $O_K$ for the ring of algebraic 
integers in $K$. The letter $m$ will always denote a fixed positive integer $\ge 2$. The 
symbol $\zeta_m$ will denote a given primitive $m$th root of unity in the algebraic closure 
of $\mathbb{Q}$. The field diagram we will consider throughout has the following form, 

            $K(\zeta_m)$ 
           /          \ 
    $F(\zeta_m)$          $K$ 
           \          / 
               $F$ 

where we assume all of the following: 

(a) $F$ is a number field, 

(b) $[F(\zeta_m) : F] = \varphi(m)$, 

(c) $K/F$ is a cyclic Galois extension with $[K : F] = m$, 

(d) $F(\zeta_m) \cap K = F$. 

Let $G := \mathrm{Gal}(K/F) = \langle \tau \rangle$. The extension $F(\zeta_m)/F$ is relative abelian and 
$H := \mathrm{Gal}(F(\zeta_m)/F) \cong (\mathbb{Z}/m\mathbb{Z})^{\times}$ by assumption (b). As usual, the $\varphi(m)$ elements 
of $H$ are denoted by $\sigma_t$, where $1 \le t \le m$, $(t,m) = 1$, and $\sigma_t(\zeta_m) = \zeta_m^t$. By 
assumption (c), the extension $K(\zeta_m)/F$ is relative abelian. By assumption (d), 
we have an isomorphism $\psi : \mathrm{Gal}(K(\zeta_m)/F) \to H \times G$, with $\psi$ defined as follows: 
Given $\rho \in \mathrm{Gal}(K(\zeta_m)/F)$, 

$$\psi(\rho) = (\rho|_{F(\zeta_m)}, \rho|_K) \in H \times G.$$

From now on, by abuse of notation, we define 

$$\tau := \psi^{-1}(\sigma_1, \tau),$$

$$\sigma_t := \psi^{-1}(\sigma_t, \tau^0), \quad 1 \le t \le m, \ (t,m) = 1,$$

as elements in $\mathrm{Gal}(K(\zeta_m)/F)$. Note that $G \cong \mathrm{Gal}(K(\zeta_m)/F(\zeta_m)) = \langle \tau \rangle$. 

Given an element $\theta \in K$ and a character $\chi \in \hat{G}$, we define the corresponding 
Lagrange resolvent $(\theta, \chi)$ by 

$$(\theta, \chi) := \sum_{i=0}^{m-1} \chi(\tau^i) \tau^i(\theta).$$

(Note: Some authors use $\bar{\chi}(\tau^i)$ instead of $\chi(\tau^i)$.) An element $\theta \in K$ such that 
the discriminant $d_{K/F}(\theta, \tau(\theta), \ldots, \tau^{m-1}(\theta))$ is non-zero is called a “normal basis 
element for $K/F$”. In the statement of the following properties of Lagrange 
resolvents, we assume that $\theta \in O_K$ is a normal basis element for $K/F$ (the 
normal basis theorem guarantees the existence of such a $\theta$, see [Na], p. 171). 

(1) $(\theta, \chi) \in O_{K(\zeta_m)}$. 

(2) $(\theta, \chi)^m \in O_{F(\zeta_m)}$. 

(3) If $\alpha \in K$ is expressed in the form $\alpha = \sum_{j=0}^{m-1} a_j \tau^j(\theta)$, where $a_j \in F$ for 
$0 \le j \le m - 1$, then $(\alpha, \chi) = A \cdot (\theta, \chi)$, where $A = \sum_{j=0}^{m-1} a_j \bar{\chi}(\tau^j)$ in 
$F(\zeta_m)$. 

(4) $d_{K/F}(\theta, \tau(\theta), \ldots, \tau^{m-1}(\theta)) = \prod_{\chi \in \hat{G}} (\theta, \chi)^2$ and so $(\theta, \chi) \ne 0$ for all $\chi \in \hat{G}$. 

The proof of (1) is immediate since $\theta \in O_K$ and $\chi(\tau^i) \in \mathbb{Z}[\zeta_m]$. Property (4) 
follows from the evaluation of the determinant of the circulant matrix whose 
first row is $(\theta \ \tau(\theta) \ \cdots \ \tau^{m-1}(\theta))$ ([Zh], p. 107). Property (3) follows easily from 

Lemma 1. If $a, b \in F$ and $\theta, \theta_1, \theta_2 \in K$, then 

(i) $(a\theta_1 + b\theta_2, \chi) = a(\theta_1, \chi) + b(\theta_2, \chi)$; 

(ii) $\tau^j((\theta, \chi)) = (\tau^j(\theta), \chi) = \bar{\chi}(\tau^j)(\theta, \chi)$ for $0 \le j \le m - 1$. 

Proof. (i) Follows directly from the definition. 

(ii) With $j$ fixed, $\tau^j((\theta, \chi)) = \tau^j\left(\sum_{i=0}^{m-1} \chi(\tau^i)\tau^i(\theta)\right) = \sum_{i=0}^{m-1} \chi(\tau^i)\tau^i(\tau^j(\theta)) 
= (\tau^j(\theta), \chi) = \bar{\chi}(\tau^j) \sum_{i=0}^{m-1} \chi(\tau^{i+j})\tau^{i+j}(\theta) = \bar{\chi}(\tau^j)(\theta, \chi)$. 

To prove property (2), note below that the relative norm lies in $F(\zeta_m)$ and $B$ is 
a unit in $O_{F(\zeta_m)}$. 

$N_{K(\zeta_m)/F(\zeta_m)}((\theta, \chi)) = \prod_{j=0}^{m-1} \tau^j((\theta, \chi)) = B \cdot (\theta, \chi)^m$, where $B = \prod_{j=0}^{m-1} \bar{\chi}(\tau^j)$ 
by (ii). For future use, we note that $N_{K(\zeta_m)/F(\zeta_m)}((\theta, \chi))$ and $(\theta, \chi)^m$ generate 
the same principal ideal in $O_{F(\zeta_m)}$. 

2 Classical Gauss Sums 

Let $p \in \mathbb{N}$ be a prime, fix a primitive root $r > 0$ modulo $p$, and let $\mathbb{F}_p$ denote the 
finite field with $p$ elements. Assume that $p \equiv 1 \pmod{m}$ and let $\chi$ be a fixed 
multiplicative character of order $m$ on $\mathbb{F}_p^{\times}$. Note that $\chi$ is a Dirichlet character 
of conductor $p$ and $\chi(r) = \zeta_m$, a primitive $m$th root of unity, since $\chi$ is of order 
$m$. The corresponding classical Gauss sum is defined by 

$$G(\chi) := \sum_{t=1}^{p-1} \chi(t)\zeta_p^t = \sum_{i=0}^{p-2} \chi(r^i)\zeta_p^{r^i} = \sum_{i=0}^{m-1} \zeta_m^i \left( \sum_{j=0}^{n-1} \zeta_p^{r^{i+mj}} \right), \quad \text{where } p - 1 = mn.$$

The inner sums $\eta_i = \sum_{j=0}^{n-1} \zeta_p^{r^{i+mj}}$, $i = 0, \ldots, m - 1$, are known as “Gaussian 
periods”. They are the $m$ roots of a monic irreducible polynomial $f_m(x) \in \mathbb{Z}[x]$, 
often referred to as an $m$th period polynomial. These Gaussian periods lie in and 
generate the unique number field $K \subset \mathbb{Q}(\zeta_p)$ such that $[K : \mathbb{Q}] = m$. Note that 
$K/\mathbb{Q}$ is a cyclic Galois extension with $\mathrm{Gal}(K/\mathbb{Q}) = \langle \tau \rangle$, where $\tau \in \mathrm{Gal}(K/\mathbb{Q})$ is 
the restriction of the automorphism in $\mathrm{Gal}(\mathbb{Q}(\zeta_p)/\mathbb{Q})$ sending $\zeta_p \mapsto \zeta_p^r$. We set 
$\chi(\tau) = \chi(r)$, noting that $\tau$ is the image of the principal ideal $(r) \subset \mathbb{Z}$ under the 
global Artin map. We have $\eta_i = \tau^i(\eta_0)$ for $0 \le i \le m - 1$ and therefore 

$$G(\chi) = \sum_{i=0}^{m-1} \chi(\tau^i)\tau^i(\eta_0) := (\eta_0, \chi) \text{ is a Lagrange resolvent.}$$
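
The identity between the Gauss sum and the Lagrange resolvent of the Gaussian period can be illustrated numerically. The sketch below is a floating-point illustration, not part of the paper: it assumes $\zeta_p = e^{2\pi i/p}$ and $\zeta_m = e^{2\pi i/m}$, the function names are hypothetical, and it uses the parameters $p = 41$, $m = 4$, $r = 6$ of the example given later in this section.

```python
import cmath
import math


def gaussian_periods(p, m, r):
    """Gaussian periods eta_0, ..., eta_{m-1} for a primitive root r modulo p."""
    n = (p - 1) // m
    zeta_p = cmath.exp(2j * math.pi / p)
    return [sum(zeta_p ** pow(r, i + m * j, p) for j in range(n))
            for i in range(m)]


def gauss_sum(p, m, r):
    """G(chi) = sum over i of chi(r**i) * zeta_p**(r**i), with chi(r) = zeta_m."""
    zeta_p = cmath.exp(2j * math.pi / p)
    zeta_m = cmath.exp(2j * math.pi / m)
    return sum(zeta_m ** (i % m) * zeta_p ** pow(r, i, p) for i in range(p - 1))


eta = gaussian_periods(41, 4, 6)
G = gauss_sum(41, 4, 6)
zeta_4 = cmath.exp(2j * math.pi / 4)
resolvent = sum(zeta_4 ** i * eta[i] for i in range(4))   # (eta_0, chi)
print(abs(G - resolvent), abs(G) ** 2)
```

Up to rounding, the printed difference is 0 and $|G(\chi)|^2$ equals $p = 41$; the periods themselves are real here because $n = 10$ is even.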

The Gaussian period $\eta_0$ is a normal basis element for $K/\mathbb{Q}$, and even better, 
we have $O_K = \mathbb{Z}[\eta_0, \eta_1, \ldots, \eta_{m-1}]$, which means that the $\eta_i$'s form a “normal 
integral basis” for $O_K$ over $\mathbb{Z}$. For future use, we note that $\eta_0 = \mathrm{Tr}_{\mathbb{Q}(\zeta_p)/K}(\zeta_p)$. 
The field $K$ belongs to the group of Dirichlet characters 

$$X = \{\chi_0, \chi, \ldots, \chi^{m-1}\}, \quad \chi_0 \text{ is the trivial character},$$

according to the terminology in Chapter 3 of [Wa]. By the 
conductor-discriminant formula, the discriminant of the field $K$ is given by $d_K = p^{m-1}$ 
since each non-trivial power of $\chi$ has conductor $p$. The field diagram 
appropriate to the present section is exactly the diagram in Section 1, with $F = \mathbb{Q}$ and 
$K = \mathbb{Q}(\eta_0)$. We have $\mathbb{Q}(\zeta_m) \cap K = \mathbb{Q}$ since $p$ splits completely in $\mathbb{Q}(\zeta_m)/\mathbb{Q}$ and 
$p$ is totally (and tamely) ramified in $K/\mathbb{Q}$. Since the $\eta_i$'s form a normal integral 
basis, we have 

$$d_{K/\mathbb{Q}}(\eta_0, \tau(\eta_0), \ldots, \tau^{m-1}(\eta_0)) = d_K = p^{m-1}, \qquad (1)$$

and so the element $(\eta_0, \chi)^m \in \mathbb{Z}[\zeta_m]$ has only primes above $p$ appearing in its 
factorization by property (4) of Lagrange resolvents in Section 1. Stickelberger's 
classic theorem makes this precise: 

((’ 70 ,xn= n (2) 

0<t<m 
(t,m) = l 



where Pi is the unique prime ideal above p in Q(Cm) such that = 

(mod Pi) (recall that x(t) = x(^) = Cm)- A more refined result known as 
“Stickelberger’s congruence” may be formulated in K{(/m). Let Pi be the unique 
prime ideal above Pi in K{C,m), and let tt = )/k{^ ~ Cp)- Then tt is a 

uniformizer in K{(/m) with respect to Pi, and 

(r;o, x") ^ (P - 1 - kn )\ (mod Pf+^) (3) 

for $k = 1, \ldots, m - 1$ (recall that $n = (p - 1)/m$). We note that (2) is directly 
derivable from (3). 

In order to make a comparison between the Gaussian periods and Stark units 
in $K$, we need to make two specializations that were not necessary above. First, 
we assume that the field $K$ is totally real which is equivalent to assuming that $\chi$ 
is an even character (or that $n = (p - 1)/m$ is even). Second, we need to fix an 
embedding of $K$ into $\mathbb{R}$ and this is conveniently done by fixing $\zeta_p = e^{2\pi i/p}$ from 
now on. The Stark unit $\varepsilon_0$ for $K/\mathbb{Q}$ is equal to $N_{\mathbb{Q}(\zeta_p)/K}(1 - \zeta_p)$ (see Section 
3). We note that $N_{K/\mathbb{Q}}(\varepsilon_0) = p$ (we used $\pi = \varepsilon_0$ as the uniformizer above in 
equation (3)). The relationship between $\varepsilon_0$ and $\eta_0$ may be found as follows. Let 
$g(x) = \sum_{k=1}^{p} \binom{p}{k} (-x)^{k-1}$, noting that $g(1 - \zeta_p) = 0$. When we factor 
$g(x)$ over $K[x]$ (it is best to define $K$ using the period polynomial $f_m(x)$), the 
polynomial 

$$x^n + (\eta_0 - n)x^{n-1} + \cdots + \varepsilon_0 \in O_K[x]$$

appears as one of the $m$ factors. These $m$ factors are conjugate under the action of 
$\mathrm{Gal}(K/\mathbb{Q})$, which allows us to easily obtain the expression $\varepsilon_0 = \sum_{j=0}^{m-1} a_j \tau^j(\eta_0)$, 
where $a_0, \ldots, a_{m-1} \in \mathbb{Z}$. 

Example: Let $p = 41$, $m = 4$, $n = 10$, $r = 6$, and $\chi(\tau) = i$. Then $\varepsilon_0 = 
-44\eta_0 - 23\tau(\eta_0) - 27\tau^2(\eta_0) - 29\tau^3(\eta_0)$. By property (3) of Lagrange resolvents 
in Section 1, $(\varepsilon_0, \chi) = (-17 - 6i)(\eta_0, \chi)$. Therefore 

{eo,x)*=plpt3-PiPi" 

and since $(-17 - 6i, 41) = 1$, we still have the classic Stickelberger factorization 
above $p = 41$. 

Question: Given the above set-up, we know that $(\varepsilon_0, \chi) = A \cdot (\eta_0, \chi)$ for some 
$A \in \mathbb{Z}[\zeta_m]$. Is $(A, p) = 1$? 

We have tested this question separately for $m = 2, 3, 4$, and $6$, where we 
allow $p$ to range over every prime such that $1 < p < 5000$ and $p \equiv 1 
\pmod{2m}$, and the answer has been found to be “yes” in each case. No clear 
pattern has emerged yet for the primes dividing $A$, but $A$ is often divisible 
by many different primes. For example, given $m = 6$, $p = 3457$, $r = 7$, and 
$\chi(\tau) = \rho$ (a primitive 6th root of unity), we found that the corresponding algebraic integer $A \in \mathbb{Z}[\rho]$ 
was divisible by prime ideals above 2, 3, 13, 31, 7069, 31147, 253681, 4757746957, 
and 81711780069590612172532572288033926813410849. When the answer to this 
question is yes, then $\varepsilon_0$ is a “good substitute” for $\eta_0$ since one still has the classic 
Stickelberger factorization over $p$ with $(\varepsilon_0, \chi)^m$ replacing $(\eta_0, \chi)^m$. This question 
is of interest since a Stark unit $\varepsilon_0$ is conjectured to exist when $F \ne \mathbb{Q}$ but there 
doesn't seem to be a known analogue for $\eta_0$ when $F \ne \mathbb{Q}$. 

3 Stark Units 

The first goal in this section is to find an appropriate generalization of the set-up
in Section 2, where we now allow $F \ne \mathbb{Q}$, and replace the Gaussian periods by
Stark units. The following assumptions will be made throughout the remainder
of the paper:

(i) F is totally real with $[F : \mathbb{Q}] = t \ge 1$, so there are t distinct real embeddings
of F into $\mathbb{R}$. Let $F^{(i)}$, for $1 \le i \le t$, denote the image of F inside $\mathbb{R}$ under the
ith embedding. The embedding for i = 1 plays a distinguished role in the
following and we remark that any one of the t embeddings may be chosen at
the beginning as this “first embedding”. Let $\mathfrak{p}_\infty^{(i)}$ denote the infinite prime
corresponding to the ith embedding and let $\alpha^{(i)} \in \mathbb{R}$ denote the image of
$\alpha \in F$ under this embedding.

(ii) $[F(\zeta_m) : F] = \varphi(m)$.

(iii) A prime $p \in \mathbb{N}$ is fixed that splits completely in $F(\zeta_m)$, so $p \equiv 1 \pmod m$.

(iv) We fix a prime ideal $\mathfrak{p} \subset O_F$ lying over p. We have $N\mathfrak{p} = p$, and $\mathfrak{p}$ splits
completely in $F(\zeta_m)$ by (iii). We also fix a primitive root r > 0 modulo p
which is therefore also a primitive root modulo $\mathfrak{p}$.

(v) Let $\mathcal{J} := G(\mathfrak{p}\,\mathfrak{p}_\infty^{(1)} \cdots \mathfrak{p}_\infty^{(t)})$ denote the narrow ray class group modulo $\mathfrak{p}$. If $\chi$ is
a character defined on this group, then the conductor $\mathfrak{f}(\chi)$ of $\chi$ has a “finite”
component $\mathfrak{f}_\chi$ and an “infinite” component $\mathfrak{f}_{\chi,\infty}$. Both of these components
are completely determined by the behavior of $\chi$ on the principal ideals in
$O_F$ generated by elements relatively prime to $\mathfrak{p}$. The finite component will
be either (1) or $\mathfrak{p}$; it is $\mathfrak{p}$ if and only if there exists a totally positive element
$\alpha \in O_F$ such that $(\alpha, \mathfrak{p}) = 1$ and $\chi((\alpha)) \ne 1$. The infinite prime $\mathfrak{p}_\infty^{(i)}$ will
appear in $\mathfrak{f}_{\chi,\infty}$ if and only if $\chi((\beta)) \ne 1$ for $\beta \in O_F$ satisfying $\beta \equiv 1 \pmod{\mathfrak{p}}$,
$\beta^{(j)} > 0$ when $j \ne i$, and $\beta^{(i)} < 0$. We assume there is a character
$\chi$ of $\mathcal{J}$ having the following properties:

(a) The conductor of $\chi$ has the form $\mathfrak{f}(\chi) = \mathfrak{f}_\chi \mathfrak{f}_{\chi,\infty} = \mathfrak{p}\,\mathfrak{p}_\infty^{(2)} \cdots \mathfrak{p}_\infty^{(t)}$.

(b) The order of $\chi$ is m.

(c) We have $\mathfrak{f}_{\chi^j} = \mathfrak{p}$ for $1 \le j \le m-1$. This condition implies that $\chi((r))$
is a primitive mth root of unity, denoted by $\zeta_m$ from now on, where (r)
is the principal ideal in $O_F$ generated by the primitive root r fixed in
(iv).

If $F = \mathbb{Q}$, these assumptions bring us back directly to the set-up at the end of
Section 2 (a Dirichlet character $\chi$ of conductor p has conductor $\mathfrak{f}(\chi) = (p)$ or
$\mathfrak{f}(\chi) = (p)\mathfrak{p}_\infty$ when considered as a ray class group character, depending upon
whether $\chi$ is even, or odd, respectively). Our assumption that F be totally real
is not completely necessary (the case where F is a complex quadratic field, for
example, is also interesting), but Stark’s conjecture has a particularly striking
form over a totally real field so we keep to this assumption. Assumptions (ii),
(iii), (iv), (v)(b), and (v)(c) are all made in order to be consistent with the set-up
in Section 2. The special shape of $\mathfrak{f}(\chi)$ in part (v)(a) does, however, require
some explanation and for this we recall some basic facts from class field theory
as well as the conjectured form of the corresponding Stark unit.

Let $X = \{\chi_0, \chi, \ldots, \chi^{m-1}\}$ be the group of ray class group characters generated
by a character $\chi$ satisfying conditions (v)(a)-(c). From class field theory,
we know there exists a uniquely defined extension field K over F belonging to
the group X of ray class group characters and having the following properties:

• K/F is a cyclic Galois extension with |Gal(K/F)| = |X| = m. If $\tau \in$
Gal(K/F) is the automorphism corresponding to the ideal (r) in (v)(c) under
the global Artin map, then Gal(K/F) = $\langle\tau\rangle$. We set $\chi(\tau) = \chi((r)) = \zeta_m$.







B.A. Tangedal 



• The relative discriminant is $d(K/F) = \mathfrak{p}^{m-1}$ by the conductor-discriminant
formula and assumption (v)(c). The only prime ideal in $O_F$ that ramifies in
K is $\mathfrak{p}$, this ramification being total and tame. Let $\mathfrak{P} \subset O_K$ be the unique
prime above $\mathfrak{p}$ in K. Since $\mathfrak{p}$ splits completely in $F(\zeta_m)$ as noted in (iv), we
see that $K \cap F(\zeta_m) = F$, and so conditions (a)-(d) in Section 1 are satisfied.

• The infinite prime $\mathfrak{p}_\infty^{(1)}$ splits completely in K, which means that every
embedding of K into $\mathbb{C}$ extending the embedding $F \to F^{(1)} \subset \mathbb{R}$ is a real
embedding.

• Each infinite prime in the set $\{\mathfrak{p}_\infty^{(2)}, \ldots, \mathfrak{p}_\infty^{(t)}\}$ (this set is empty if $F = \mathbb{Q}$)
ramifies in K, i.e., every embedding of K into $\mathbb{C}$ extending an embedding
$F \to F^{(j)} \subset \mathbb{R}$, $2 \le j \le t$, is non-real. This implies that $2 \mid m$ if $F \ne \mathbb{Q}$ (m
being odd is still interesting if $F = \mathbb{Q}$).

• A prime ideal $\mathfrak{q} \subset O_F$ splits completely in K/F if and only if $\chi(\mathfrak{q}) = 1$.
This exact characterization of completely split primes uniquely determines
the extension K/F by a theorem of Bauer (see [Ja], Cor. 5.5, p. 136).

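Let $S = \{\mathfrak{p}_\infty^{(1)}, \ldots, \mathfrak{p}_\infty^{(t)}, \mathfrak{p}\}$, and define for each $\psi \in X$ the corresponding
L-function:

$$L_S(s, \psi) = \prod_{\substack{\mathfrak{q} \text{ prime} \\ \mathfrak{q} \notin S}} \left(1 - \frac{\psi(\mathfrak{q})}{N\mathfrak{q}^{s}}\right)^{-1}.$$

These infinite products converge only for $\Re(s) > 1$, but there exists a meromorphic
continuation of $L_S(s, \psi)$ to all of $\mathbb{C}$ (the extended function is still denoted
by $L_S(s, \psi)$), and each of these extended functions is analytic at s = 0. If $F = \mathbb{Q}$,
each function $L_S(s, \psi)$ here has exactly a first order zero at s = 0. If $F \ne \mathbb{Q}$,
the order of the zero at s = 0 of each $L_S(s, \psi)$, denoted by $r(\psi)$, is given by the
following formula because of the conditions that $\chi$ satisfies in (v)(a) and (v)(c)
(see [DT]):

$$r(\chi^j) = 1 \quad \text{for } j \text{ odd}; \qquad (4)$$

$$r(\chi^j) = t \ge 2 \quad \text{for } j \text{ even}. \qquad (5)$$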

The most direct way to express Stark’s conjecture is via partial zeta-functions.
These are defined here by

$$\zeta_S(s, \tau^k) = \frac{1}{m} \sum_{\psi \in X} \overline{\psi}((r)^k)\, L_S(s, \psi) \quad \text{for } 0 \le k \le m-1,$$

where again, $\tau \in$ Gal(K/F) is the automorphism corresponding to the ideal (r)
under the global Artin map. Note that each of these partial zeta-functions has at
least a first order zero at s = 0 and that $\zeta_S'(0, \tau^k) \in \mathbb{R}$ for all k. Indeed, by our
special choice of a character $\chi$ satisfying the conditions in (v)(a)-(c), it can be
shown that $\zeta_S'(0, \tau^k) \ne 0$ in all cases. Since we are directly concerned with first
derivatives of partial zeta-functions at s = 0 in Stark’s conjecture, the assumptions
we make regarding $\chi$ now have an explanation.

Let K be the extension field of F belonging to the group X as described above
and fix an embedding $K \subset \mathbb{R}$ extending the distinguished embedding
$F \to F^{(1)} \subset \mathbb{R}$.

Stark’s Conjecture ([St2], [Ta]): There exists a unique algebraic integer
$\varepsilon_0 \in O_K$ (called the “Stark unit for K/F”) such that

$$(\tau^k(\varepsilon_0))^{(1)} = e^{-2\zeta_S'(0, \tau^k)} \quad \text{for } k = 0, \ldots, m-1. \qquad (6)$$

It is also conjectured that if $\varepsilon_0 \in O_K$ satisfies (6), then $K(\sqrt{\varepsilon_0})/F$ is an abelian
extension. The algebraic integer $\varepsilon_0$ we defined in Section 2 when $F = \mathbb{Q}$ satisfies
this conjecture and indeed Stark’s conjecture has been proved in full over the
base field $\mathbb{Q}$ [St2]. However, if F is a totally real field with $[F : \mathbb{Q}] > 1$, then
only a few special cases of the conjecture have been proved (see, for example,
[DST] and [Sh]). The Stark unit $\varepsilon_0$ in Section 2 was not an actual unit (recall
that $N_{K/\mathbb{Q}}(\varepsilon_0) = p$), but if F is a totally real field other than $\mathbb{Q}$ then the Stark
unit is conjectured to always be an element of $O_K^\times$ and therefore a “true” unit.

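We assume throughout the remainder of the paper that $F \ne \mathbb{Q}$. Recall that
$2 \mid [K : F]$ in this setting. Let M denote the unique intermediate field $F \subset
M \subset K$ such that $[K : M] = 2$ and let $\rho = \tau^{m/2}$, which is the non-trivial
automorphism in Gal(K/M). We have $\zeta_S'(0, \rho\tau^k) = -\zeta_S'(0, \tau^k)$ for all k, since
$L_S'(0, \chi^j) = 0$ for j even by (5), and also since $\chi^j(\rho) = -1$ for j odd. This identity
among partial zeta-function values has two important consequences for the
conjectured Stark unit:

$$\rho(\varepsilon_0) = \frac{1}{\varepsilon_0}; \qquad (7)$$

and

$$|(\tau^k(\varepsilon_0))^{(j)}| = 1 \quad \text{for } k = 0, \ldots, m-1, \qquad (8)$$

for any embedding $K \to K^{(j)} \subset \mathbb{C}$ not extending $F \to F^{(1)} \subset \mathbb{R}$, since
$(\rho(\alpha))^{(j)} = \overline{\alpha^{(j)}}$ for all $\alpha \in K$ for such an embedding. If $\varepsilon_0 \in O_K^\times$ satisfies
(6), then by Theorem 1 of [St1] we have $K = F(\varepsilon_0)$ and even $K = \mathbb{Q}(\varepsilon_0)$ since
$L_S'(0, \chi^j) \ne 0$ for j odd, as noted in (4). Given (6), (8), and the values $\zeta_S'(0, \tau^k)$
to sufficient accuracy, we may compute a unique candidate for the minimal
polynomial

$$f_{\varepsilon_0}(x) = \prod_{k=0}^{m-1} (x - \tau^k(\varepsilon_0)) \in O_F[x]$$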

of $\varepsilon_0$ over F after a finite search which is often surprisingly efficient (see [Go],
p. 309). Once $f_{\varepsilon_0}(x)$ has been computed, it can be proved independently that a
root of $f_{\varepsilon_0}(x)$ generates the unique class field K over F belonging to the group
X. This explicit construction of the class field K arises directly from the values at
s = 0 of the meromorphic functions $\zeta_S'(s, \tau^k)$, and these values are computable
using only data from the base field F. This gives a beautiful supplement to
the classical existence proofs of class field theory and demonstrates why Stark’s
conjecture gives at least a partial response to Hilbert’s twelfth problem.

We now let the Stark unit $\varepsilon_0$ play the role that the Gaussian period $\eta_0$ played
in Section 2 and study how much from the relations in equations (1), (2), and (3)
of Section 2 remains true with respect to the Stark unit. Equation (1) is already
quite special and it is not guaranteed at all that there exists a normal integral
basis for $O_K$ over $O_F$. The following proposition shows that there does exist a
normal $\mathfrak{p}$-integral basis for a field extension K/F satisfying our assumptions. Let
$v_{\mathfrak{p}}(\,\cdot\,)$ denote the exact power to which a prime $\mathfrak{p}$ divides a given element.

Proposition 1. There exists an element $\alpha \in O_K$ such that

$$v_{\mathfrak{p}}(d_{K/F}(\alpha, \tau(\alpha), \ldots, \tau^{m-1}(\alpha))) = m - 1.$$

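Proof. We recall that the ramification of $\mathfrak{P}$ over $\mathfrak{p}$ is tame, which is crucial
for the application of Noether’s theorem below. Let $K_{\mathfrak{P}}$ and $F_{\mathfrak{p}}$ denote the
corresponding completions with valuation rings $O_{\mathfrak{P}}$ and $O_{\mathfrak{p}}$, respectively. For
ease of notation, the symbols $\mathfrak{P}$ and $\mathfrak{p}$ are also used to denote the maximal
ideals in $O_{\mathfrak{P}}$ and $O_{\mathfrak{p}}$, respectively. Note that Gal($K_{\mathfrak{P}}/F_{\mathfrak{p}}$) $\cong$ Gal(K/F) and let
$\tau' \in$ Gal($K_{\mathfrak{P}}/F_{\mathfrak{p}}$) be the unique automorphism that restricts to $\tau \in$ Gal(K/F).
By Noether’s theorem [Ch], we know there exists an element $\theta \in O_{\mathfrak{P}}$ such
that $v_{\mathfrak{p}}(d_{K_{\mathfrak{P}}/F_{\mathfrak{p}}}(\theta, \tau'(\theta), \ldots, \tau'^{\,m-1}(\theta))) = m - 1$. By the strong approximation
theorem ([Ha], p. 379), there exists an $\alpha \in O_K$ such that $\alpha \equiv \theta \pmod{\mathfrak{P}^{m^2}}$. Set
$\alpha_j = \tau^j(\alpha)$ and $\theta_j = \tau'^{\,j}(\theta)$ for j = 0, . . . , m - 1. We have

$$\alpha_i\alpha_j \equiv \theta_i\theta_j \pmod{\mathfrak{P}^{m^2}} \quad \text{and} \quad \mathrm{Tr}_{K/F}(\alpha_i\alpha_j) \equiv \mathrm{Tr}_{K_{\mathfrak{P}}/F_{\mathfrak{p}}}(\theta_i\theta_j) \pmod{\mathfrak{p}^m}$$

for i = 0, . . . , m - 1; j = 0, . . . , m - 1. Therefore

$$d_{K/F}(\alpha, \tau(\alpha), \ldots, \tau^{m-1}(\alpha)) \equiv d_{K_{\mathfrak{P}}/F_{\mathfrak{p}}}(\theta, \tau'(\theta), \ldots, \tau'^{\,m-1}(\theta)) \pmod{\mathfrak{p}^m},$$

from which the proposition follows immediately.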

For a Stark unit $\varepsilon_0$, we have in general $v_{\mathfrak{p}}(d_{K/F}(\varepsilon_0, \tau(\varepsilon_0), \ldots, \tau^{m-1}(\varepsilon_0))) \ge
m - 1$, with equality usually holding. We will return to this point later in this
section. In regard to equation (2) in Section 2, we computed the prime factorization
of the ideal $((\varepsilon_0, \chi)^m)$ in $O_{F(\zeta_m)}$ for hundreds of examples. We will see later
that it is possible to have $v_{\mathfrak{p}}(d_{K/F}(\varepsilon_0, \tau(\varepsilon_0), \ldots, \tau^{m-1}(\varepsilon_0))) > m - 1$ and still
have the factorization of $((\varepsilon_0, \chi)^m)$ above p take the classic Stickelberger form.
Before we state the results of our computations, we first consider what can be
said about the general shape of the prime factorization of $(\alpha, \chi)$ in $O_{K(\zeta_m)}$, when
$\alpha \in O_K$ is an arbitrary normal basis element for K/F. In direct analogy to the
definitions given in Section 2, let $\mathfrak{p}_1$ be the unique prime ideal in $O_{F(\zeta_m)}$ above
$\mathfrak{p} \subset O_F$ such that $\zeta_m \equiv r^{-(p-1)/m} \pmod{\mathfrak{p}_1}$, let $\mathcal{P}_1$ be the unique prime ideal
above $\mathfrak{p}_1$ in $K(\zeta_m)$, and let $\pi \in K(\zeta_m)$ be a uniformizer with respect to $\mathcal{P}_1$.
Recall that $G := \mathrm{Gal}(K(\zeta_m)/F(\zeta_m)) = \langle\tau\rangle$ and note that the congruence class
of $\tau(\pi)/\pi$ modulo $\mathcal{P}_1$ is independent of the choice of $\pi$ since the inertia group of
the prime ideal $\mathcal{P}_1$ over $F(\zeta_m)$ is equal to G.

Lemma 2. We have

$$\frac{\tau(\pi)}{\pi} \equiv r^{(p-1)/m} \pmod{\mathcal{P}_1}.$$

Proof. The congruence $(r/\mathfrak{p}_1)_m \equiv r^{(p-1)/m} \pmod{\mathfrak{p}_1}$ holds for the mth power
residue symbol by definition. Let $F_{\mathfrak{p}_1}$ denote the completion of $F(\zeta_m)$ with
respect to $\mathfrak{p}_1$ and let $K_{\mathcal{P}_1}$ denote the completion of $K(\zeta_m)$ with respect to $\mathcal{P}_1$.
The extension $K_{\mathcal{P}_1}/F_{\mathfrak{p}_1}$ is tame and totally ramified of relative degree m. If $\tau' \in$
Gal($K_{\mathcal{P}_1}/F_{\mathfrak{p}_1}$) is the unique automorphism restricting to $\tau \in G$, we just need to
prove $\tau'(\Pi)/\Pi \equiv r^{(p-1)/m} \pmod{\mathcal{P}_1}$ for any $\Pi \in K_{\mathcal{P}_1}$ such that $v_{\mathcal{P}_1}(\Pi) = 1$.

By theorem 5.11 in [Na], there exists an element FI G TC-p^ with vp^{FI) = 1 
such that ip := Fl^ G Fp^ satisfies vpj^{ip) = 1. The restriction of the local Artin 
symbol {r~^ ,Kp^/Fp^) G Gal(7Cpj/FpJ to G is equal to the global Artin symbol 
{{r),K{fm)/F{(pm)) = T and therefore t'{II)/F[ = ,Kp^/ Fpf){II) / II . The 

latter expression is equal to the Hilbert symbol (^-p^). By the basic properties 
of the Hilbert symbol ([Ne], p. 334), we have 

and the last expression is equal to {r/Pi)m ([Ne], p. 415), completing the proof. 



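Theorem 1. Given an arbitrary normal basis element $\alpha \in O_K$ for K/F, the
factorization of the ideal $((\alpha, \chi))$ in $O_{K(\zeta_m)}$ is given by

$$((\alpha, \chi)) = (I) \prod_{\substack{0 < t < m \\ (t,m) = 1}} \sigma_t^{-1}(\mathcal{P}_1)^{c_t},$$

where (I) is the lift to $O_{K(\zeta_m)}$ of an integral ideal $I \subset O_{F(\zeta_m)}$ relatively prime
to $\mathfrak{p}$, and $c_t \in \mathbb{N}$ satisfies $c_t \equiv t \pmod m$ for all t as above.

Proof. Fix k in the range $1 \le k \le m - 1$. By a calculation similar to the proof
of Lemma 1(ii), $\tau((\alpha, \chi^k)) = \zeta_m^{-k}(\alpha, \chi^k)$. Therefore

$$\frac{\tau((\alpha, \chi^k))}{(\alpha, \chi^k)} = \zeta_m^{-k} \equiv r^{k(p-1)/m} \equiv \left(\frac{\tau(\pi)}{\pi}\right)^{k} \pmod{\mathcal{P}_1},$$

where the first congruence holds by the choice of $\mathfrak{p}_1$, and the second congruence
holds by Lemma 2. By Discussion (4.4) in [Br], we note that the congruence
$\tau((\alpha, \chi^k))/(\alpha, \chi^k) \equiv (\tau(\pi)/\pi)^k \pmod{\mathcal{P}_1}$ implies that $v_{\mathcal{P}_1}((\alpha, \chi^k)) \equiv k
\pmod m$. For a fixed t, we have $\sigma_t((\alpha, \chi)) = (\alpha, \chi^t)$, so

$$c_t := v_{\sigma_t^{-1}(\mathcal{P}_1)}((\alpha, \chi)) \equiv t \pmod m.$$

We now prove the statement about (I). Given a prime ideal $\mathfrak{Q} \subset O_{F(\zeta_m)}$ that
does not lie over $\mathfrak{p}$, we know that $\mathfrak{Q}$ is unramified in $K(\zeta_m)$, say $(\mathfrak{Q}) = \mathfrak{Q}_1 \cdots \mathfrak{Q}_g$.
By Lemma 1(ii), $\tau$ fixes the ideal $((\alpha, \chi)) \subset O_{K(\zeta_m)}$. By the transitive action of
G on $\{\mathfrak{Q}_1, \ldots, \mathfrak{Q}_g\}$, $v_{\mathfrak{Q}_1}((\alpha, \chi)) = \cdots = v_{\mathfrak{Q}_g}((\alpha, \chi))$, which completes the proof.

Corollary 1. In $O_{F(\zeta_m)}$ we have the ideal factorization

$$((\alpha, \chi)^m) = I^m \prod_{\substack{0 < t < m \\ (t,m) = 1}} \sigma_t^{-1}(\mathfrak{p}_1)^{c_t}$$

with $\alpha$, I, and the same exponents $c_t$ as in Theorem 1.

Proof. Take the relative norm of the factorization in Theorem 1, recalling from
Section 1 that $N_{K(\zeta_m)/F(\zeta_m)}((\alpha, \chi))$ and $(\alpha, \chi)^m$ generate the same principal
ideal in $O_{F(\zeta_m)}$.

Note: From Corollary 1 and the congruence $c_t \equiv t \pmod m$ for each t, we see
that the class of the product $\prod_{0<t<m,\,(t,m)=1} \sigma_t^{-1}(\mathfrak{p}_1)^{t}$ in the ideal class group of $F(\zeta_m)$
is an mth power.

Corollary 2. If $\alpha \in O_K$ is such that $v_{\mathfrak{p}}(d_{K/F}(\alpha, \tau(\alpha), \ldots, \tau^{m-1}(\alpha))) = m - 1$,
as in Proposition 1, then $c_t = t$ for each t, i.e., the factorization of $((\alpha, \chi)^m)$
above p takes the classic Stickelberger form.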



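Proof. Set $\alpha_i = \tau^i(\alpha)$ for $0 \le i \le m - 1$, and write $\alpha_i \equiv \sum_{j=0}^{m-1} b_{ij}\pi^j \pmod{\mathfrak{P}^m}$,
where $b_{ij} \in \{0, \ldots, p - 1\}$. Let $B = (b_{ij})$, $0 \le i, j \le m - 1$, be the corresponding
matrix. We also assume here that $\pi \in O_K$ is a uniformizer with respect to $\mathfrak{P}$ and
note that $\pi$ is a uniformizer in $O_{K(\zeta_m)}$ with respect to any prime ideal lying over
$\mathfrak{P}$. Given this choice of $\pi$, we have $v_{\mathfrak{p}}(d_{K/F}(1, \pi, \ldots, \pi^{m-1})) = v_{\mathfrak{p}}(d(K/F)) =
m - 1$, and since $v_{\mathfrak{p}}(d_{K/F}(\alpha_0, \alpha_1, \ldots, \alpha_{m-1})) = m - 1$ by assumption, it follows
that $p \nmid \det(B)$.

Let $V = (s^{ij})$ for $0 \le i, j \le m - 1$, where $s \equiv r^{-(p-1)/m} \pmod p$. Then

$$\begin{pmatrix} (\alpha, \chi^0) \\ (\alpha, \chi^1) \\ \vdots \\ (\alpha, \chi^{m-1}) \end{pmatrix} \equiv V \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \vdots \\ \alpha_{m-1} \end{pmatrix} \equiv VB \begin{pmatrix} 1 \\ \pi \\ \vdots \\ \pi^{m-1} \end{pmatrix} \pmod{\mathcal{P}_1^m},$$

where there are m congruences to be read along the rows. The matrix V occurs in
the theory of Fourier transforms over $\mathbb{F}_p$ [Po] and appears here when we replace
the mth roots of unity in the Lagrange resolvents by the appropriate powers of
the primitive root r modulo p. Let W = VB, and note that the matrix W
is upper-triangular since $v_{\mathcal{P}_1}((\alpha, \chi^k)) \ge k$ for $0 \le k \le m - 1$, as we saw in the
proof of Theorem 1. Since $p \nmid \det(W)$, all of the diagonal elements of W are
non-zero and so $v_{\mathcal{P}_1}((\alpha, \chi^k)) = k$ for $0 \le k \le m - 1$. This implies in turn that
$c_t = t$ for each t.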



We now turn to Lagrange resolvents constructed from a Stark unit $\varepsilon_0$. Recall
from equation (7) that $\rho(\varepsilon_0) = 1/\varepsilon_0$ for an element $\rho \in$ Gal(K/F). The inertia
group of the prime ideal $\mathfrak{P}$ over F is equal to Gal(K/F), and so $1/\varepsilon_0 \equiv \varepsilon_0 \pmod{\mathfrak{P}}$
by definition of the inertia group, or $\varepsilon_0^2 \equiv 1 \pmod{\mathfrak{P}}$. From now on, we let
a denote the unique choice of -1 or 1 such that $v_{\mathfrak{P}}(\varepsilon_0 - a) > 0$. The case
where $v_{\mathfrak{P}}(\varepsilon_0 - a) = 1$ is particularly nice since we can make a “canonical”
choice for $\pi$ with $\pi := \varepsilon_0 - a$. We assume this holds for now, and prove a
partial analogue to the Stickelberger congruence (3) in Section 2 (recall that
Stickelberger’s congruence depends critically upon the choice of the uniformizer).
In the discussion below, we set $\varepsilon_k = \tau^k(\varepsilon_0)$ for k = 0, . . . , m - 1.

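Theorem 2. Assume that $v_{\mathfrak{P}}(\varepsilon_0 - a) = 1$ and set $\pi = \varepsilon_0 - a$. Then

(i) $(\varepsilon_0, \chi^0) \equiv ma \pmod{\mathcal{P}_1}$,

(ii) $(\varepsilon_0, \chi^1) \equiv m\pi \pmod{\mathcal{P}_1^2}$.

Proof. (i) This is immediate since $\varepsilon_k \equiv a \pmod{\mathcal{P}_1}$ for k = 0, . . . , m - 1, and
$(\varepsilon_0, \chi^0) = \varepsilon_0 + \varepsilon_1 + \cdots + \varepsilon_{m-1}$.

(ii) Clearly $\varepsilon_0 \equiv a + 1 \cdot \pi \pmod{\mathcal{P}_1^2}$, and by Lemma 2 we have

$$\varepsilon_1 \equiv a + r^{(p-1)/m} \cdot \pi \pmod{\mathcal{P}_1^2}$$
$$\varepsilon_2 \equiv a + r^{2(p-1)/m} \cdot \pi \pmod{\mathcal{P}_1^2}$$
$$\vdots$$
$$\varepsilon_{m-1} \equiv a + r^{(m-1)(p-1)/m} \cdot \pi \pmod{\mathcal{P}_1^2}.$$

We have $(\varepsilon_0, \chi) = \varepsilon_0 + \zeta_m\varepsilon_1 + \cdots + \zeta_m^{m-1}\varepsilon_{m-1}$ and also $\zeta_m \equiv r^{-(p-1)/m}
\pmod{\mathfrak{p}_1}$. Therefore

$$(\varepsilon_0, \chi) \equiv a\,(1 + \zeta_m + \cdots + \zeta_m^{m-1}) + \left[1 + r^{-(p-1)/m} \cdot r^{(p-1)/m} + \cdots + r^{-(m-1)(p-1)/m} \cdot r^{(m-1)(p-1)/m}\right]\pi \equiv m\pi \pmod{\mathcal{P}_1^2},$$

which concludes the proof of (ii).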
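The cancellation behind part (ii) can be checked in $\mathbb{F}_p$ alone. The following is a minimal sketch, assuming the resolvent is $\varepsilon_0 + \zeta_m\varepsilon_1 + \cdots + \zeta_m^{m-1}\varepsilon_{m-1}$, that $\zeta_m$ reduces to the inverse of $u = r^{(p-1)/m}$, and that $\varepsilon_k \equiv a + u^k\pi$; the function names are hypothetical. It returns the coefficients of $a$ and of $\pi$ in the reduced resolvent, which should be 0 and m.

```python
def primitive_root(p):
    """Smallest primitive root modulo an odd prime p (brute force)."""
    for g in range(2, p):
        x, order = g, 1
        while x != 1:
            x = (x * g) % p
            order += 1
        if order == p - 1:
            return g
    raise ValueError("no primitive root found")


def resolvent_coefficients(p, m):
    """Coefficients (c_a, c_pi) with (eps_0, chi) = c_a * a + c_pi * pi modulo p."""
    if (p - 1) % m != 0:
        raise ValueError("need p = 1 (mod m)")
    r = primitive_root(p)
    u = pow(r, (p - 1) // m, p)      # models tau(pi)/pi from Lemma 2
    z = pow(u, p - 2, p)             # inverse of u, models zeta_m modulo p_1
    c_a = sum(pow(z, k, p) for k in range(m)) % p
    c_pi = sum(pow(z, k, p) * pow(u, k, p) for k in range(m)) % p
    return c_a, c_pi


print(resolvent_coefficients(29, 4), resolvent_coefficients(19, 6))
```

The two sample calls use the residue characteristics 29 and 19 that appear in the computations reported in this section.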

We also observed that $(\varepsilon_0, \chi^2) \equiv \left(\frac{am}{2}\right)\pi^2 \pmod{\mathcal{P}_1^3}$ in the setting of Theorem
2, when $m \ge 4$. This last congruence seems to be directly tied to the equality
$\varepsilon_{m/2} = 1/\varepsilon_0$, but we have as yet no proof. What we observed with $m \ge 4$ in
terms of the matrix $W = (w_{ij})$, $0 \le i, j \le m - 1$, that was introduced in the
proof of Corollary 2 was the following pattern (recall that W is upper-triangular
and $\pi = \varepsilon_0 - a$ is the uniformizer):



/ TOO 0 0 0 



W = 



T * * 

am am 
2 2 
* 






V 
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where an asterisk indicates that several different possible values were observed in 
these positions with respect to certain constraints. The one constraint in column
4, for example, with the two undetermined coefficients there was that the sum
going down the column was always congruent to 0 modulo p. Therefore, the $w_{13}$
and $w_{33}$ entries depend directly upon each other. The following table summarizes
some systematic computations we carried out when m = 4. If $w_{33} \ne 0$, then
$((\varepsilon_0, \chi)^m)$ has the classic Stickelberger factorization above p. We computed 90
examples with F real quadratic, discriminant range $5 \le d_F \le 5000$, $N\mathfrak{p} = 29$,
and m = 4. In 3 of these examples, $v_{\mathfrak{P}}(\varepsilon_0 - a) = 3$. In the remaining 87 examples,
we have $v_{\mathfrak{P}}(\varepsilon_0 - a) = 1$. The table below covers these 87 examples, where the
second row gives the frequency that $w_{33}$ was equal to the value in the first row.



0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18
5  1  ...  5  1  1  ...  5  4  5  4



19  20  21  22  23  24  25  26  27  28
 3   3   2   3   3   3   3   3   2   3

Thus $w_{33}$ was non-zero in 82 of the 87 examples in the table. In terms of the
exponents appearing in both Theorem 1 and Corollary 1, we have the following
possibilities when m = 4:

(i) $v_{\mathfrak{P}}(\varepsilon_0 - a) = 1$, $w_{33} \ne 0$: $c_1 = 1$, $c_3 = 3$.

(ii) $v_{\mathfrak{P}}(\varepsilon_0 - a) = 1$, $w_{33} = 0$: $c_1 = 1$, $c_3 > 3$.

(iii) $v_{\mathfrak{P}}(\varepsilon_0 - a) = 3$: $c_1 > 1$, $c_3 = 3$.

(iv) $v_{\mathfrak{P}}(\varepsilon_0 - a) > 3$: $c_1 > 1$, $c_3 > 3$.

Case (i) is the dominant one and gives the classic Stickelberger factorization
above p. Having $v_{\mathfrak{P}}(\varepsilon_0 - a) = 2$ is not possible as we’ll see below. Case (iv) was
only observed once out of hundreds of different examples.

In order to obtain some understanding for the observed frequencies when
m = 4, we consider the following model which runs along similar lines to the
notation and method used in the proof of Corollary 2. Let the indeterminate z
play the role of a fixed uniformizer $\pi \in O_K$ with respect to $\mathfrak{P}$. The congruence
classes modulo $\mathcal{P}_1^4$ form a ring isomorphic to the quotient ring $\mathcal{R} := \mathbb{F}_p[z]/(z^4)$.
Let $\gamma_0 = a + b_{01}z + b_{02}z^2 + b_{03}z^3 + (z^4) \in \mathcal{R}$ (we drop the $+(z^4)$ below). We define
a map $s : \mathcal{R} \to \mathcal{R}$ as follows: s is trivial on $\mathbb{F}_p$ and $s : z \mapsto r^{(p-1)/4}z + e_2z^2 + e_3z^3$.
This map s models the action of $\tau$ on $\pi$ recorded in Lemma 2. The elements
$\gamma_1, \gamma_2, \gamma_3 \in \mathcal{R}$ defined below are now uniquely determined:

$$\gamma_1 = s(\gamma_0) = a + b_{11}z + b_{12}z^2 + b_{13}z^3$$
$$\gamma_2 = s(\gamma_1) = a + b_{21}z + b_{22}z^2 + b_{23}z^3$$
$$\gamma_3 = s(\gamma_2) = a + b_{31}z + b_{32}z^2 + b_{33}z^3.$$

Let $B = (b_{ij})$, $0 \le i, j \le 3$, as before, where $b_{i0} = a$ for $0 \le i \le 3$. We
now compute the matrix W = VB (V is the fixed matrix defined in the proof
of Corollary 2) as we vary the 5 parameters $b_{01}, b_{02}, b_{03}, e_2$, and $e_3$ among the
values $\{0, 1, \ldots, p - 1\}$. The special form of the map $s : \mathcal{R} \to \mathcal{R}$ guarantees that
we have $s(\gamma_3) = \gamma_0$ and that W is upper-triangular for any combination of values
for the 5 parameters above. In order to have $\gamma_0, \gamma_1, \gamma_2, \gamma_3$ model the behavior
of $\varepsilon_0, \varepsilon_1, \varepsilon_2, \varepsilon_3$, we count how many times a given possibility occurs under the
constraint that we must have $\gamma_0\gamma_2 = \gamma_1\gamma_3 = 1 \in \mathcal{R}$ (there does not seem to
be any other constraint on the $\varepsilon$’s) as we let the 5 parameters vary through all
possible values. The following counts were obtained with a = 1 (the same counts
hold for a = -1):

(i)' $b_{01} \ne 0$, $w_{33} \ne 0$: count = $(p - 1)^2p^2$.

(ii)' $b_{01} \ne 0$, $w_{33} = 0$: count = $(p - 1)p^2$.

(*) $b_{01} = 0$, $b_{02} \ne 0$: count = 0.

(iii)' $b_{01} = b_{02} = 0$, $b_{03} \ne 0$: count = $(p - 1)p^2$.

(iv)' $b_{01} = b_{02} = b_{03} = 0$: count = $p^2$.

Cases (i)', (ii)', (iii)', and (iv)' correspond to the cases (i), (ii), (iii), (iv) defined
above. This model shows a good agreement to the results of the computations
discussed above. The fact that we obtain a count of 0 in case (*) shows why
$v_{\mathfrak{P}}(\varepsilon_0 - a) = 2$ is not possible. We also noted with these counts that in the
case where $b_{01} \ne 0$ we find that $w_{33}$ takes each value in the set $\{0, 1, \ldots, p - 1\}$
equally often.
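Part of this model can be reproduced by brute force for a small prime. The sketch below is an illustration under stated assumptions rather than the authors' computation: it takes p = 5 to keep the search tiny, uses 2 (an element of order 4 modulo 5) as the linear coefficient of s(z), fixes a = 1, and classifies solutions only by the coefficients of $\gamma_0$. The entry $w_{33}$ is not computed, since that requires the matrix V. All names are hypothetical.

```python
from itertools import product

P = 5          # small prime with P = 1 (mod 4), chosen to keep the search tiny
RHO = 2        # element of order 4 modulo 5, standing in for r^((p-1)/4)
A = 1          # the sign a


def mul(f, g):
    """Product in F_p[z]/(z^4); elements are 4-tuples (c0, c1, c2, c3)."""
    out = [0, 0, 0, 0]
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            if i + j < 4:
                out[i + j] = (out[i + j] + fi * gj) % P
    return tuple(out)


def apply_s(f, e2, e3):
    """Image of f under the F_p-algebra map z -> RHO*z + e2*z^2 + e3*z^3."""
    image_of_z = (0, RHO, e2, e3)
    result, power = (0, 0, 0, 0), (1, 0, 0, 0)
    for coeff in f:
        result = tuple((x + coeff * y) % P for x, y in zip(result, power))
        power = mul(power, image_of_z)
    return result


ONE = (1, 0, 0, 0)
counts = {"b1!=0": 0, "b1=0,b2!=0": 0, "b1=b2=0,b3!=0": 0, "all zero": 0}
s_has_order_dividing_4 = True

for b1, b2, b3, e2, e3 in product(range(P), repeat=5):
    g0 = (A, b1, b2, b3)
    g1 = apply_s(g0, e2, e3)
    g2 = apply_s(g1, e2, e3)
    g3 = apply_s(g2, e2, e3)
    if apply_s(g3, e2, e3) != g0:
        s_has_order_dividing_4 = False
    # Keep only parameter choices satisfying gamma_0*gamma_2 = gamma_1*gamma_3 = 1.
    if mul(g0, g2) != ONE or mul(g1, g3) != ONE:
        continue
    if b1:
        counts["b1!=0"] += 1
    elif b2:
        counts["b1=0,b2!=0"] += 1
    elif b3:
        counts["b1=b2=0,b3!=0"] += 1
    else:
        counts["all zero"] += 1

print(s_has_order_dividing_4, counts)
```

The first bucket is the total for cases (i)' and (ii)' together; the remaining buckets correspond to cases (*), (iii)', and (iv)'.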

We noted earlier that the valuation $v_{\mathfrak{p}}(d_{K/F}(\varepsilon_0, \ldots, \varepsilon_{m-1}))$ is occasionally
greater than m - 1. If this valuation is equal to m - 1, then the factorization of
$((\varepsilon_0, \chi)^m)$ above p takes the classic Stickelberger form as noted in Corollary 2.
The condition for having the classic Stickelberger factorization above p is actually
weaker, namely, we have this special factorization if and only if $v_{\mathfrak{P}}(\varepsilon_0 - a) = 1$
and $w_{tt} \ne 0$ for all 0 < t < m with (t, m) = 1 when using $\pi = \varepsilon_0 - a$ as
the uniformizer with respect to $\mathfrak{P}$. For example, if m = 6 and $\pi = \varepsilon_0 - a$ is a
uniformizer with respect to $\mathfrak{P}$, then we can have $w_{44} = 0$ and $w_{55} \ne 0$ which
implies that the above valuation is greater than 5 but we still have the classic
Stickelberger factorization above p (we have $w_{11} \ne 0$ by Theorem 2 (ii)). The
table below summarizes some systematic computations we carried out when
m = 6. We computed 61 examples with F real quadratic, discriminant range
$5 \le d_F \le 5900$, $N\mathfrak{p} = 19$, and m = 6. In 4 of these examples, $v_{\mathfrak{P}}(\varepsilon_0 - a) = 3$. In
the remaining 57 examples, we have $v_{\mathfrak{P}}(\varepsilon_0 - a) = 1$. The table below covers these
57 examples, where the second row gives the frequency that $w_{55}$ was equal to
the value in the first row. Note that $w_{55}$ was non-zero in 54 of the 57 examples in
the table. All of our computations were carried out with the package PARI/GP
[GP] and the method used for computing the first derivatives of the partial
zeta-functions at s = 0 is based upon the Lavrik-Friedman formula (see [DT]).

We applied a chi-square test to the data in each table under the null hypothesis
that the distribution into the different congruence classes was uniform. In both
cases, for the significance level $\alpha = 0.05$, the null hypothesis was not rejected (we were far
from the critical value in both cases). In view of the fact that there are zeros in
some of the cells, it might be more appropriate to apply the exact permutation
test than the chi-square test, but since the margin was quite safe in both cases
we believe the same conclusion would have been reached.
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Cryptanalysis of a Divisor Class Group Based 
Public-Key Cryptosystem

Abstract. We prove that the public-key cryptosystem proposed by
H. Kim and S. Moon in “New Public-Key Cryptosystem Using Divisor
Class Groups”, ACISP 2001 [1], is not secure, by solving the discrete logarithm
problem for the divisor class groups used in the proposal.



1 Introduction 

In [1], H. Kim and S. Moon proposed a public-key cryptosystem based on ideal
arithmetic in the divisor class group of an affine normal subring of $\mathbb{F}[X, Y]$ for
some field $\mathbb{F}$. The security of the system is based on the difficulty of the discrete
logarithm problem in the divisor class group. There have been many proposals of
cryptosystems based on divisor class groups of imaginary quadratic orders,
but as far as we know, this was the only proposal based on
divisor class groups of polynomial rings.

The authors of the proposal suggest that the discrete logarithm problem for
the divisor class group is much more difficult than that for the class group of an
order of a quadratic number field. In this paper we show, however, that there
is a ‘standard’ generator [P] for the divisor class group, so that each element
[I] can be written as $[P]^k$ for some k, and moreover that the representation of the
element [I] reveals the exponent k rather transparently.

2 Overview of the Scheme 

For the mathematical background and a complete treatment, see the original
proposal [1]. The goal of this section is mostly to fix notation.

2.1 General Definitions 

Throughout this article, F will denote a fixed base field. 

D.A. Buell (Ed.): ANTS 2004, LNCS 3076, pp. 442-450, 2004.

Let R be an integral domain, and K its quotient field. Then K is an R-module.
If J is an R-submodule of K, and if $xJ \subset R$ for some nonzero $x \in R$,
then we say that J is a fractional ideal.

When I, J are two fractional ideals, we can multiply and divide them as:

$$I \cdot J = \Big\{ \sum_{\text{finite}} \alpha\beta \;\Big|\; \alpha \in I,\ \beta \in J \Big\}$$

$$I : J = \{ x \in K \mid xJ \subset I \}$$

In particular, we denote by $I^{-1}$ the ideal quotient R : I, and we define the
‘divisorial closure’ $\overline{I} = (I^{-1})^{-1}$. We say that a fractional ideal I is divisorial if
$I = \overline{I}$. Note that $\overline{I}$ is equal to the intersection of all principal fractional ideals
containing I.

We denote by $\mathcal{D}(R)$ the set of divisorial ideals. It becomes a commutative
monoid when we define the product I * J to be $\overline{I \cdot J}$, where $I \cdot J$ is the (fractional)
ideal product.

In general $\mathcal{D}(R)$ is not a group; it is a group if and only if R is
completely integrally closed. This is the case for the rings of type $R = R_{n,j}$
which we will introduce shortly, so we can then define our divisor class group as
follows:

We define the divisor class group Cl(R) to be the quotient $\mathcal{D}(R)/P(R)$, where
P(R) is the subgroup of the principal fractional ideals. We say that an element
of Cl(R) is a divisor class, and when J is a divisorial ideal, we denote its
divisor class by [J].

2.2 Cryptosystem Proposed by Kim and Moon 

Kim and Moon proposed [1] a public-key cryptosystem based on divisor class
groups of certain polynomial rings. Its security is based on the difficulty of the
discrete logarithm problem of the divisor class group, and the Gröbner basis
computation is used as the basic tool for ideal computation.

For motivation of actual candidates for such polynomial rings, they refer to
a theorem proved by D. F. Anderson in [2]:

Theorem 2.1 (Theorem 2.5 of [2]). Let R be an affine integrally closed subring
of $T = \mathbb{F}[X, Y]$ generated by monomials, where $\mathbb{F}$ is a field. Moreover suppose
that T is integral over R. Then R is isomorphic either to T itself or to

$$R_{n,j} := \mathbb{F}[X^n, XY^j, X^2Y^{\overline{2j}}, \ldots, X^{n-1}Y^{\overline{(n-1)j}}, Y^n],$$

where 0 < j < n and gcd(j, n) = 1. Here $\overline{m}$ denotes the smallest representative
in $\mathbb{N}$ of the congruence class of m modulo n.
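The generating monomials in Theorem 2.1 are easy to list explicitly. The following is a minimal sketch, with a hypothetical function name, that returns the exponent pairs (a, b) of the generators $X^aY^b$ of $R_{n,j}$. Since gcd(j, n) = 1, the reduced exponent $\overline{aj}$ lies strictly between 0 and n for 0 < a < n.

```python
from math import gcd


def generators(n, j):
    """Exponent pairs (a, b) of the monomial generators X^a Y^b of R_{n,j}."""
    if not (0 < j < n and gcd(j, n) == 1):
        raise ValueError("need 0 < j < n and gcd(j, n) = 1")
    # X^a Y^(aj mod n) for 0 < a < n, framed by X^n and Y^n.
    middle = [(a, (a * j) % n) for a in range(1, n)]
    return [(n, 0)] + middle + [(0, n)]


print(generators(5, 2))
print(generators(3, 1))
```

For j = 1 the middle generators are the powers of XY, so the list is redundant in that case.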

Since $R_{n,j}$ is an integrally closed Noetherian domain, it is a Krull domain by
the Mori-Nagata Theorem ([4]). So it is completely integrally closed and we may
talk about its divisor class group $Cl(R_{n,j})$. D. F. Anderson also gives the divisor
class group computation for such rings:

Theorem 2.2 (Theorem 4.4 and its proof of [2]). Let us use the same notation
as in Theorem 2.1.

Then $Cl(R_{n,j})$ is isomorphic to $\mathbb{Z}/n\mathbb{Z}$.

Moreover if we define the prime ideal

$$P = (X^n, XY^j, X^2Y^{\overline{2j}}, \ldots, X^{n-1}Y^{\overline{(n-1)j}}),$$

then [P] is a generator of the divisor class group $Cl(R_{n,j})$.

Since the general $R_{n,j}$ is rather unwieldy for the Gröbner basis computation,
Kim and Moon use $R_{n,1} = \mathbb{F}[X^n, XY, Y^n]$ as the ring of choice. Their cryptosystem
can be defined simply as the ElGamal and Diffie-Hellman schemes in the divisor
class group $Cl(R_{n,1})$.

In this paper we prove most results for the general case $R_{n,j}$, and for the
special case j = 1 we provide the computation of the complete structure of the
cyclic group $Cl(R_{n,1})$ in Proposition 3.4 of Section 3.

For the rest of the article, we will maintain the notation given in this section.
So j will be a fixed integer satisfying 0 < j < n, gcd(j, n) = 1, etc. When n and
j are clear, we will often abbreviate $R_{n,j}$ simply as R. In that case K will denote
its quotient field.

3 Preparation 

First, we characterize polynomials of $\mathbb{F}[X, Y]$ which are elements of $R_{n,j}$ as
follows:

Proposition 3.1. For any $f(X, Y) \in \mathbb{F}[X, Y]$, $f(X, Y) \in R_{n,j}$ if and only if
every monomial $X^aY^b$ occurring in f(X, Y) satisfies $b \equiv aj \pmod n$.

Proof. Recall that

$$R_{n,j} := \mathbb{F}[X^n, XY^j, X^2Y^{\overline{2j}}, \ldots, X^{n-1}Y^{\overline{(n-1)j}}, Y^n].$$

Each of the monomials $X^n, XY^j, X^2Y^{\overline{2j}}, \ldots, X^{n-1}Y^{\overline{(n-1)j}}, Y^n$ is of the form in
the statement of the proposition. Therefore their products are again monomials
of the required form. Then any polynomial of $R_{n,j}$ will be a linear combination of
monomials of the required form.

Now we prove the other direction. We need only prove that if $X^aY^b$
satisfies the relation $b \equiv aj \pmod n$, then $X^aY^b \in R_{n,j}$.

First, let us divide a by n, and denote the quotient and the remainder by q
and r, respectively. Then a = qn + r. So,

$$X^aY^b = (X^n)^q \cdot X^rY^b.$$

But $X^n \in R_{n,j}$. So we need only show that $X^rY^b \in R_{n,j}$.

Therefore without loss of generality we may assume from the start that $0 \le
a < n$.

By definition, $\overline{aj} \equiv aj \equiv b \pmod n$ and $b \ge \overline{aj}$. Hence $b - \overline{aj} = ni$ for some
$i \ge 0$. Then,

$$X^aY^b = X^aY^{\overline{aj} + ni} = X^aY^{\overline{aj}} \cdot (Y^n)^i \in R_{n,j}.$$

□
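The proof of Proposition 3.1 is constructive, and the steps can be sketched in a few lines. The function names below are hypothetical; `decompose` follows the proof by splitting off powers of $X^n$, then a single generator $X^rY^{\overline{rj}}$ (or 1 when r = 0), then powers of $Y^n$.

```python
def in_R(a, b, n, j):
    """Proposition 3.1 criterion for the monomial X^a Y^b."""
    return a >= 0 and b >= 0 and (b - a * j) % n == 0


def decompose(a, b, n, j):
    """Write X^a Y^b = (X^n)^q * X^r Y^t * (Y^n)^i with t = (r*j) mod n.

    Returns (q, r, t, i).
    """
    if not in_R(a, b, n, j):
        raise ValueError("monomial is not in R_{n,j}")
    q, r = divmod(a, n)           # a = q*n + r with 0 <= r < n
    t = (r * j) % n               # reduced exponent of Y for the generator
    i, rem = divmod(b - t, n)     # b = t + n*i, as in the last step of the proof
    if rem != 0 or i < 0:
        raise AssertionError("unreachable when the criterion holds")
    return q, r, t, i


print(decompose(7, 4, 5, 2), decompose(3, 11, 5, 2), in_R(1, 1, 5, 2))
```

Multiplying the three factors back together recovers the exponents: a = qn + r and b = t + ni.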

Next, we need to characterize those polynomials of $\mathbb{F}[X, Y]$ which are elements
of K, the quotient field of $R = R_{n,j}$.

Proposition 3.2. We have

$$K \cap \mathbb{F}[X, Y] = R.$$

Proof. A computational proof is possible, but since $R_{n,j}$ is integrally closed and
$\mathbb{F}[X, Y]$ is integral over $R_{n,j}$, any element of $K \cap \mathbb{F}[X, Y]$ must be in R. □

Recall from Theorem 2.2 that

$$P = (X^n, XY^j, X^2Y^{\overline{2j}}, \ldots, X^{n-1}Y^{\overline{(n-1)j}}).$$

Lemma 3.1. We have

$$X^n \in \overline{P^k}.$$

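Proof. Since $\overline{P^k}$ is the intersection of all principal fractional ideals containing
$P^k$, in order to show that $X^n \in \overline{P^k}$, we need to prove that whenever $x \in K$
satisfies $x \cdot P^k \subset R$, we have $x \cdot X^n \in R$.

Suppose that $x \in K$ is such an element. Since $X^kY^{kj} \in P^k$, we have
$x \cdot X^kY^{kj} \in R$. So we can write x as $f/X^kY^{kj}$ for some $f \in R$.

And since $X^{nk} \in P^k$,

$$x \cdot X^{nk} = \frac{f}{X^kY^{kj}} \cdot X^{nk} \in R.$$

Therefore $X^kY^{kj} \mid f \cdot X^{nk}$ in the polynomial ring $\mathbb{F}[X, Y]$. Then
$Y^{kj} \mid f \cdot X^{nk}$. Since $Y^{kj}$ and $X^{nk}$ are mutually prime, we get $Y^{kj} \mid f$.

Now we have to show that $x \cdot X^n \in R$. But,

$$x \cdot X^n = \frac{f}{X^kY^{kj}} \cdot X^n = \frac{f}{Y^{kj}} \cdot X^{n-k}.$$

Since $f/Y^{kj}$ and $X^{n-k}$ are in $\mathbb{F}[X, Y]$, $x \cdot X^n$ is in $K \cap \mathbb{F}[X, Y] = R$. Therefore

$$X^n \in \overline{P^k}.$$

□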



Throughout this article, when $g_1, \ldots, g_r$ are elements of $R_{n,j} \subset \mathbb{F}[X, Y]$, 
$\gcd(g_1, \ldots, g_r)$ is the greatest common divisor of $g_1, \ldots, g_r$ in the polynomial 
ring $\mathbb{F}[X, Y]$. If $S$ is an arbitrary subset of $R_{n,j}$ (or more generally of $\mathbb{F}[X, Y]$), 
then $\gcd(S)$ is defined as the greatest common divisor of the elements of $S$. Note 
that if $I$ is an ideal of $R_{n,j}$ generated by $g_1, \ldots, g_r \in R_{n,j}$, then $\gcd(I) = 
\gcd(g_1, \ldots, g_r)$. Since $R_{n,j}$ is an affine $\mathbb{F}$-algebra, every ideal $I$ of $R_{n,j}$ is finitely 
generated, so $\gcd(I)$ is well defined. 



Proposition 3.3. We have 

$$\gcd(\overline{P^k}) = X^k.$$ 
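Proof. Since $X^n$ and $X^kY^{kj}$ are in $\overline{P^k}$, $\gcd(\overline{P^k})$ divides $\gcd(X^n, X^kY^{kj}) = X^k$. 

So we need only prove that $X^k$ divides every element of $\overline{P^k}$. 

First, we claim that 

$$\frac{Y^{nk}}{X^kY^{kj}} \cdot P^k \subset R.$$ 

Every element of $P^k$ is an $R$-linear combination of products $F_1F_2 \cdots F_k$, where 
each $F_i$ is one of $X^n$, $XY^j$, $X^2Y^{\overline{2j}}$, \ldots, $X^{n-1}Y^{\overline{(n-1)j}}$, so it suffices to 
treat such products. If any of the $F_i$ is $X^n$, then clearly 

$$\frac{Y^{nk}}{X^kY^{kj}} \cdot F_1F_2 \cdots F_k \in K \cap \mathbb{F}[X, Y] = R.$$ 

If none of the $F_i$ is $X^n$, then each $F_i$ has at least one factor $X$, and therefore 
$X^k \mid F_1F_2 \cdots F_k$. So the above containment holds again. 

Now suppose that $f$ is any element of $\overline{P^k}$. Since $Y^{nk}/(X^kY^{kj}) \cdot P^k \subset R$, 
the characterization of $\overline{P^k}$ used in the proof of Lemma 3.1 gives 

$$\frac{Y^{nk}}{X^kY^{kj}} \cdot f \in R.$$ 

Hence $X^k \mid Y^{nk} \cdot f$. From this it follows that $X^k \mid f$. □ 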
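
As a small sanity check of the proposition, the following Python sketch computes the gcd of all $k$-fold products of generators of $P$ for small parameters. It rests on two assumptions that are not spelled out in this section: that $P$ is generated by $X^n$ and the monomials $X^aY^{\overline{aj}}$ for $a = 1, \ldots, n-1$, and that $\gcd(j, n) = 1$. It only inspects $P^k$, not its divisorial closure, so it illustrates the statement rather than proving it; since $P^k \subset \overline{P^k}$, the value it finds is a multiple of $\gcd(\overline{P^k})$.

```python
from itertools import combinations_with_replacement


def generators_of_P(n, j):
    """Exponent pairs (a, b) of the assumed generators of P:
    X^n and X^a Y^(a*j mod n) for a = 1, ..., n-1."""
    return [(n, 0)] + [(a, (a * j) % n) for a in range(1, n)]


def gcd_of_power(n, j, k):
    """Exponent pair of the gcd of all k-fold products of generators of P."""
    gens = generators_of_P(n, j)
    products = [
        (sum(a for a, _ in combo), sum(b for _, b in combo))
        for combo in combinations_with_replacement(gens, k)
    ]
    # The gcd of a set of monomials is the componentwise minimum of exponents.
    return (min(a for a, _ in products), min(b for _, b in products))
```

With the illustrative values $n = 7$, $j = 3$, the call `gcd_of_power(7, 3, k)` returns the pair `(k, 0)`, that is $X^k$, for each $k = 1, \ldots, 6$.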

The above proposition gives much information about the ideal $\overline{P^k}$. Since the 
divisor class $[P]$ is a generator of the divisor class group $Cl(R_{n,j})$, we will use 
this to analyze the cryptosystem proposed by Kim and Moon in the next section. 

For the special case $j = 1$, we can say even more: it is possible 
to compute $\overline{P^k}$ explicitly, thereby determining the complete structure of the 
divisor class group $Cl(R_{n,1})$ of order $n$: 

Proposition 3.4. Let $j = 1$. Then $R = R_{n,1} = \mathbb{F}[X^n, XY, Y^n]$. Again let $P$ be 
the prime ideal $(X^n, XY)$. Then 

$$R : P^k = \left( \frac{(XY)^{n-k}}{X^n},\, 1 \right) \quad\text{and}\quad \overline{P^k} = (X^n, (XY)^k).$$ 

Proof. Suppose that $f/g \in R : P^k$. Then $f/g \cdot P^k \subset R$. Since $(XY)^k \in P^k$, 
$f/g \cdot (XY)^k = h$ for some $h \in \mathbb{F}[X^n, XY, Y^n]$. Then $f/g = h/(XY)^k$. Therefore 
we may assume that $g = (XY)^k$. 

Also, since $(X^n)^k \in P^k$, $f/(XY)^k \cdot X^{nk} = h_1$ for some $h_1 \in R$. Simplifying, 
we have 

$$f \cdot X^{(n-1)k} = h_1 \cdot Y^k.$$ 

Therefore $Y^k \mid f$. But $f$ is in $\mathbb{F}[X^n, XY, Y^n]$, so each monomial occurring in $f$ is 
a product of powers of $X^n$, $XY$, and $Y^n$. Collect all monomials of $f$ which contain 
a factor $Y^n$, and denote their sum by $f_1 \cdot Y^n$. This part is divisible by $Y^k$. Now, since $f$ is 
divisible by $Y^k$, the remaining monomials must contain at least $k$ factors of $XY$. 
Therefore, collecting all of them, their sum can be written as $f_2 \cdot (XY)^k$. Therefore 

$$f = f_1 \cdot Y^n + f_2 \cdot (XY)^k \quad\text{for some } f_1, f_2 \in R.$$ 



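Then $f/g$ is equal to 

$$\begin{aligned}
\frac{f}{g} &= \frac{f_1 \cdot Y^n + f_2 \cdot (XY)^k}{(XY)^k} \\
&= f_1 \cdot \frac{Y^n}{(XY)^k} + f_2 \cdot 1 \\
&= f_1 \cdot \frac{Y^nX^n}{(XY)^kX^n} + f_2 \cdot 1 \\
&= f_1 \cdot \frac{(XY)^{n-k}}{X^n} + f_2 \cdot 1.
\end{aligned}$$ 

Therefore $R : P^k \subset ((XY)^{n-k}/X^n, 1)$. And it is trivial to check that both 
$(XY)^{n-k}/X^n$ and $1$ are in $R : P^k$. So $R : P^k = ((XY)^{n-k}/X^n, 1)$. 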

Now let's prove that the inverse of $((XY)^{n-k}/X^n, 1)$ is $(X^n, (XY)^k)$. 
Suppose that $f/g$ is in the inverse of $((XY)^{n-k}/X^n, 1)$. Then by definition 
$f/g \cdot (XY)^{n-k}/X^n \in R$ and $f/g \cdot 1 \in R$. Therefore again without loss of generality 
we may let $g = 1$. Then $f \cdot (XY)^{n-k}/X^n = h$ for some $h \in R = \mathbb{F}[X^n, XY, Y^n]$. 
Simplifying, $f \cdot Y^{n-k} = h \cdot X^k$. So $X^k \mid f$. As before, collect all terms occurring 
in $f$ which contain at least one factor $X^n$ and write their sum as $f_1 \cdot X^n$. Then for the rest of the 
terms, in order that $X^k \mid f$, they must contain at least $k$ factors of $XY$. So we 
may write 

$$f = f_1 \cdot X^n + f_2 \cdot (XY)^k \quad\text{for some } f_1, f_2 \in R.$$ 

Hence $f \in (X^n, (XY)^k)$. Again it is trivial to verify that both $X^n$ and $(XY)^k$ 
are in the inverse of $((XY)^{n-k}/X^n, 1)$, proving that $\overline{P^k} = (X^n, (XY)^k)$. □ 



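Remark. The above computation is consistent with the proposition that $[P]$ 
generates the divisor class group $Cl(R_{n,1})$ of order $n$: note that 

$$[R : P^k] = \left[\left( \frac{(XY)^{n-k}}{X^n},\, 1 \right)\right] = [((XY)^{n-k}, X^n)] = [\overline{P^{n-k}}].$$ 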



4 Cryptanalysis of the Scheme 

In their proposal [1], Kim and Moon choose the ring $R_{n,1}$ for their divisor class 
group construction. Divisor classes are represented by reduced ideals of $R_{n,1}$, 
and various ideal computations (product, quotient, ‘divisorial closure’, etc.) are 
done by the Gröbner basis method. 

Considered as an abstract group, there is no problem in applying public-key 
primitives based on the discrete logarithm problem to the divisor class group 
$Cl(R_{n,j})$ for general $j$, but Kim and Moon restrict their choice to $j = 1$ in order 
to use the isomorphism $R_{n,1} = \mathbb{F}[X^n, XY, Y^n] \cong \mathbb{F}[x, y, z]/(z^n - xy)$. They use 
this isomorphism to define and use the concept of reduced ideals, which are unique 
representatives of divisor classes. 

In this section we will solve the discrete logarithm problem for $Cl(R_{n,j})$. 

Suppose that we are given a divisorial ideal $I$. Without loss of generality we 
may assume that $I$ is in fact in $R_{n,j}$, that is, an ordinary ideal. We would like 
to compute $k$ satisfying $[I] = [P]^k$ ($k = 0, \ldots, n - 1$). 

Lemma 4.1. For any ordinary divisorial ideal $I$, there exist an $f \in R_{n,j}$ and a 
$k \in \{0, \ldots, n - 1\}$ so that 

$$I = \frac{f}{X^n} \cdot \overline{P^k}.$$ 

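Proof. Since $[P]$ generates the divisor class group $Cl(R_{n,j})$, there exist $f, g \in 
R_{n,j}$ and some $k \in \{0, \ldots, n - 1\}$ so that 

$$I = \frac{f}{g} \cdot \overline{P^k}.$$ 

So we need to show that in fact it suffices to take $g = X^n$. 

Let $d = \gcd(f, g)$. Note that $d \in \mathbb{F}[X, Y]$ and in general $d \notin R_{n,j}$. Then there 
are two mutually prime polynomials $f', g' \in \mathbb{F}[X, Y]$ so that $f = d \cdot f'$, $g = d \cdot g'$. 
Since $X^n \in \overline{P^k}$, we have 

$$\frac{f'}{g'} \cdot X^n = \frac{f}{g} \cdot X^n \in I \subset R \subset \mathbb{F}[X, Y].$$ 

Therefore $g' \mid f' \cdot X^n$. Since $f'$ and $g'$ are mutually prime, $g' \mid X^n$. Therefore 
$g' = X^r$ for some $r \le n$ (up to a nonzero constant, which may be absorbed into $f'$). 

Then 

$$\frac{f}{g} = \frac{f'}{g'} = \frac{f'}{X^r} = \frac{f' \cdot X^{n-r}}{X^n}.$$ 

Let $f''$ be the polynomial $f' \cdot X^{n-r}$. Then $f/g = f''/X^n$. But then $f'' = 
f/g \cdot X^n$, which is in the quotient field $K$. Since it is also a polynomial, it follows 
from Proposition 3.2 that $f'' \in R_{n,j}$. Therefore 

$$I = \frac{f''}{X^n} \cdot \overline{P^k}$$ 

is the equation that we were searching for. 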



□ 
Using the above lemma and the fact that $\gcd(\overline{P^k}) = X^k$, we may solve the 
discrete logarithm problem for $Cl(R_{n,j})$ as follows: 

Given an ordinary divisorial ideal $I$, we have, by the above lemma, $I = 
f/X^n \cdot \overline{P^k}$ for some $f \in R_{n,j}$ and $k$. Then, 

$$X^n \cdot I = f \cdot \overline{P^k}.$$ 

Taking the greatest common divisors of both sides, we get 

$$X^n \cdot \gcd(I) = f \cdot \gcd(\overline{P^k}) = f \cdot X^k.$$ 

Then $f = X^{n-k} \cdot \gcd(I)$. 

We have to determine the unknown $k$. But $f$ is in $R_{n,j}$, and there is a criterion 
for polynomial membership in $R_{n,j}$ (Proposition 3.1). 

Take any monomial $X^aY^b$ occurring in the polynomial $\gcd(I) \in \mathbb{F}[X, Y]$. 
Then $X^{n-k+a}Y^b$ occurs as a monomial in $f$. Since $f \in R_{n,j}$, it follows that 
$b \equiv (n - k + a)j \equiv (a - k)j \pmod{n}$. Therefore we have 

$$k \equiv a - bj^{-1} \pmod{n},$$ 

where $j^{-1}$ is taken modulo $n$. 
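
The final step of the attack, reading $k$ off the exponents of a monomial of $\gcd(I)$, can be sketched in Python as follows. The sketch assumes that $\gcd(I)$ has already been computed by some polynomial gcd routine and is handed over as a list of exponent pairs $(a, b)$; the function name and this input format are hypothetical choices made here for illustration, and $\gcd(j, n) = 1$ is assumed so that $j^{-1}$ exists.

```python
def recover_exponent(gcd_monomials, n, j):
    """Recover k with [I] = [P]^k from the monomials X^a Y^b of gcd(I).

    gcd_monomials is a list of exponent pairs (a, b); each pair should
    yield the same value k = a - b * j^{-1} (mod n).
    """
    j_inv = pow(j, -1, n)  # modular inverse; needs Python 3.8+ and gcd(j, n) = 1
    candidates = {(a - b * j_inv) % n for (a, b) in gcd_monomials}
    if len(candidates) != 1:
        raise ValueError("monomials disagree; input is not of the expected form")
    return candidates.pop()
```

Every monomial of $\gcd(I)$ must give the same $k$, so the function uses all of them as a consistency check rather than trusting a single one. The cost is one modular inversion and a few multiplications, independent of how the ideal was produced.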



5 Some Examples 



Example 1. Consider the following prime ideal 

$$P' = (XY^j, X^2Y^{\overline{2j}}, X^3Y^{\overline{3j}}, \ldots, Y^n).$$ 

By the symmetry of $X$ and $Y$, $P'$ also generates $Cl(R_{n,j})$. What is the exponent 
$k$ such that $[P'] = [P]^k$? Clearly, $\gcd(P') = Y$. Therefore, in order that $X^{n-k}Y \in 
R_{n,j}$, we need $1 \equiv (n - k)j \pmod{n}$, that is, $k \equiv -j^{-1} \pmod{n}$, so $k = n - j^{-1}$. 



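Example 2. Consider the following ideal 

$$I = \bigl( X^n + X^{n-1}Y^{(n-1)j},\; XY^j + Y^{j+(n-1)j},\; \ldots,\; X^{n-1}Y^{\overline{(n-1)j}} + X^{n-2}Y^{\overline{(n-1)j}+(n-1)j} \bigr).$$ 

In fact $I$ is simply 

$$I = \frac{X^n + X^{n-1}Y^{(n-1)j}}{X^n} \cdot P,$$ 

but let's follow our methodology to see that $k = 1$ indeed comes out. 

Since $X^n + X^{n-1}Y^{(n-1)j} = X^{n-1}(X + Y^{(n-1)j})$, $XY^j + Y^{j+(n-1)j} = 
Y^j(X + Y^{(n-1)j})$, and so on, we have 

$$\gcd(I) = X + Y^{(n-1)j}.$$ 

Therefore, in order that $X^{n-k} \cdot \gcd(I)$ be in $R_{n,j}$, the monomials $X^{n-k+1}$ and 
$X^{n-k}Y^{(n-1)j}$ must satisfy the relation $b \equiv aj \pmod{n}$, where $a$ and $b$ are the exponents 
of $X$ and $Y$, respectively. The unique solution in this case is $k = 1$. Therefore 

$$[I] = [P].$$ 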
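
The two examples can be replayed numerically with a brute-force variant of the method: try every $k$ and keep those for which all monomials of $X^{n-k} \cdot \gcd(I)$ pass the membership criterion of Proposition 3.1. The sketch below is illustrative only; the function names and the concrete values $n = 7$, $j = 3$ are assumptions made here and do not come from the original paper.

```python
def in_R(a, b, n, j):
    """Membership criterion for the monomial X^a Y^b: b = a*j (mod n)."""
    return (b - a * j) % n == 0


def candidate_exponents(gcd_monomials, n, j):
    """All k in 0..n-1 for which X^(n-k) * gcd(I) lies in R_{n,j}.

    gcd_monomials lists the exponent pairs (a, b) of the monomials of gcd(I).
    """
    return [
        k for k in range(n)
        if all(in_R(n - k + a, b, n, j) for (a, b) in gcd_monomials)
    ]


n, j = 7, 3  # illustrative parameters with gcd(j, n) = 1

# Example 1: gcd(P') = Y, i.e. the single exponent pair (0, 1).
example_1 = candidate_exponents([(0, 1)], n, j)

# Example 2: gcd(I) = X + Y^((n-1)j), i.e. the pairs (1, 0) and (0, (n-1)*j).
example_2 = candidate_exponents([(1, 0), (0, (n - 1) * j)], n, j)
```

For these parameters $j^{-1} = 5$ modulo $7$, so Example 1 should give the single candidate $k = n - j^{-1} = 2$, and Example 2 the single candidate $k = 1$.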
6 Conclusions 

In this paper we have shown that, in the case of the divisor class group $Cl(R_{n,j})$, 
the representation of elements of the group transparently reveals the exponent 
with respect to a generator $[P]$, thereby proving that the public-key cryptosystem 
proposed by Kim and Moon is insecure. 

In general, we suspect that the use of Gröbner bases and large polynomial 
rings in the construction of a cryptosystem is at best problematic, because the 
efficiency of the ideal computation is only loosely related to the security parameter 
of the divisor class group, thus making efficiency and security assertions about 
the system very difficult. 